PhD Defence | Building Stream Reasoners for the Web of Data
Hamid Bazoubandi’s thesis explores the design and implementation of Stream Reasoners for the Web of Data. Hamid completed his research under the supervision of prof. Henri Bal, prof. Frank van Harmelen, and dr. Jacopo Urbani (all from VU Amsterdam).
The rapid expansion of the Internet of Things (IoT) has generated a massive volume of streaming data and has amplified the demand for real-time decision-making and action-taking. However, the volume and velocity of streaming data exceed any human operator's processing capacity. Hence, we need computer systems that can process large volumes of streaming data in (quasi) real-time and automatically identify implicit relationships between pieces of data in order to derive additional knowledge. Stream Reasoning is a relatively young research area that combines stream processing with Semantic Web technology to achieve this goal.
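To give a flavour of what this means in practice, the following toy sketch applies a single rule over a ten-second sliding window and derives an implicit fact from two explicit facts in the stream. It is only an illustration of the idea, not Laser's actual interface; the rule, predicates, and window length are invented for this example.

```python
from collections import deque
from time import time

# Toy illustration (not Laser's actual API): one rule over a sliding time
# window. The rule derives the implicit fact "overheating(room)" when both a
# high-temperature and a smoke reading for the same room arrive within the
# last 10 seconds of the stream.
WINDOW_SECONDS = 10

window = deque()  # holds (timestamp, predicate, room) facts

def push(fact, now=None):
    """Add an incoming fact to the window, evict expired facts, re-derive."""
    now = now if now is not None else time()
    window.append((now, *fact))
    while window and window[0][0] < now - WINDOW_SECONDS:
        window.popleft()
    return derive()

def derive():
    """Apply the rule: overheating(R) <- high_temp(R), smoke(R) in window."""
    hot = {room for _, pred, room in window if pred == "high_temp"}
    smoky = {room for _, pred, room in window if pred == "smoke"}
    return {("overheating", room) for room in hot & smoky}

# Two explicit facts arriving in the stream yield one implicit fact.
push(("high_temp", "server_room"), now=100.0)
print(push(("smoke", "server_room"), now=103.0))
# {('overheating', 'server_room')}
```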
Although Stream Reasoning has witnessed considerable progress since its inception, several challenges still require further study. For instance, because of the high computational complexity of reasoning algorithms, there is an inherent conflict between the speed and the depth of reasoning used to find implicit knowledge: the deeper the search for implicit knowledge, the longer the processing takes. For a system that must operate under tight time constraints, finding the right balance is crucial. A second challenge is that stream reasoners must understand the meaning of the relationships among entities in the data in order to derive implicit knowledge. The ontological data that defines these relationships is therefore of paramount importance in stream reasoning. Unfortunately, this ontological data is seldom streamed along with the rest of the data, so the reasoner (or the human operator) must retrieve it separately from the Web. The difficulty is that the Web is a highly dynamic and unsupervised environment where sources publish data independently of each other, and data may (dis)appear or change silently at any moment. Moreover, data on the Web may contain inconsistencies or mistakes. All of this poses significant challenges for stream reasoning.
In this thesis, Hamid builds Laser, a stream reasoner that supports a powerful logic for in-depth reasoning over streaming data, and shows that, by using heuristics and novel optimization techniques, Laser outperforms the state of the art. Hamid also studies the effect of background ontological data on stream reasoning and shows that it significantly boosts the amount of implicit knowledge that reasoners can derive from the stream. However, he also identifies the challenges that arise from relying on it, such as the large percentage of missing data and broken links on the Web, and conflicting ontological data that can cause the reasoning process to fail.
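As an illustration of how background ontological data can boost the amount of derived knowledge, the following minimal sketch applies the standard RDFS subclass rule to a single streamed type assertion and obtains two additional implicit facts. Again, this is not Laser's implementation; the ontology and stream facts are invented for the example.

```python
# Minimal sketch (not Laser's implementation) of how background ontological
# data increases the facts a reasoner can derive from the stream.

# Background ontology: rdfs:subClassOf statements retrieved from the Web.
ontology = {
    ("TemperatureSensor", "subClassOf", "Sensor"),
    ("Sensor", "subClassOf", "Device"),
}

# Explicit fact arriving on the stream: a type assertion for an observed entity.
stream_facts = {("probe42", "type", "TemperatureSensor")}

def derive_types(facts, ontology):
    """Apply the RDFS subclass rule until fixpoint:
       (x type C) and (C subClassOf D)  =>  (x type D)."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for x, _, c in [f for f in derived if f[1] == "type"]:
            for c1, _, d in [o for o in ontology if o[1] == "subClassOf"]:
                if c == c1 and (x, "type", d) not in derived:
                    derived.add((x, "type", d))
                    changed = True
    return derived

# Without the ontology, nothing beyond the explicit fact can be derived;
# with it, two extra implicit facts follow.
print(derive_types(stream_facts, ontology) - stream_facts)
# {('probe42', 'type', 'Sensor'), ('probe42', 'type', 'Device')}
```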