Supporting heterogeneity in IoT gateway using Light-weight Semantic Web

Tracking #: 2553-3767

Hafizur Rahman
Dr. Md. Iftekhar Hussain

Responsible editor: 
Armin Haller

Submission type: 
Full Paper
Internet of Things (IoT) application depends on heterogeneous devices, sensors, protocols and technologies. This heterogeneity affects raw data generated by such devices. Due to data heterogeneity, achieving interoperability among different IoT devices and applications is very challenging. To interpret such raw sensor data into knowledgeable information and identify real-life events in this world, semantic web technology is highly adopted. Semantic web collects data from different sources, processes them and publishes through the web. Again, collecting data from heterogeneous environments for IoT application is highly demanding. It suffers from a significant amount of processing time and memory consumption for large-scale IoT. To address these challenges, we proposed a light-weight and heterogeneity support semantic web (Het-SW) in the IoT gateway. Our proposed scheme supports heterogeneous devices/data and provides a scalable solution for a large number of IoT devices by implementing sensor virtualization technique. Simulation results show that Het-SW outperforms XGSN and L-SW in terms of data heterogeneity, scalability, reliability and response time. Even after giving support of heterogeneous data to Het-SW, the operating cost is satisfactory for resource constraint IoT gateway.

Solicited Reviews:
Review #1
Anonymous submitted on 15/Oct/2020
Review Comment:

The authors investigate how Semantic Web technologies can be used to achieve semantic interoperability in the IoT. They propose an approach called Het-SW in which an IoT gateway collects heterogeneous data from sensors, structures it, then annotates it semantically. The LiO-IoT ontology, proposed by the authors in a previous work, is used to semantify the data. They report on an experiment on simulated data produced by a varying number of nodes that generate varying amounts of data in different data formats.

This manuscript was submitted as 'full paper' and should be reviewed along the usual dimensions for research contributions which include (1) originality, (2) significance of the results, and (3) quality of writing.

In my opinion, the manuscript falls really short of the minimum quality for each of these dimensions, and should clearly be rejected.

1. Originality. The paper breaks no ground. The related work is old and very limited, with only 11 references to scientific articles. Much research on semantic interoperability has been published in the past and is ignored in this paper. See the list at the bottom of this review.

2. Significance of the result. The proposed solution Het-SW is compared to only two other solutions called XGSN (proposed in 2014), and L-SW. The latter is not even introduced in the manuscript. The experiment is poorly described and reported. There is no explanation of what kind of data is generated by the IoT devices, where the heterogeneity is, or how the data is converted. The system and the experiment respect none of the FAIR guiding principles for research. See

3. Quality of the writing. The writing is very poor, there are many grammar errors and typos throughout the paper.

Below is a short list of reactions to the many "surprises" I had while reading the paper.

- "Semantic Web collects data from different sources" (abstract). Semantic Web is not an agent. Semantic Web is used as an agent in many sentences.
- Fig 1. I don't consider that discussing "trends" is science. The figure is extracted from a paper published in 2014. The "trend" might have changed since.
- p.2, 2nd column lines 14-20. Clumsy description of what triples are: "triplets such as subject, predicate, and object". Then: "one can deduce [a triple] by applying the knowledge in a transitive property". The authors should consider that the audience of the Semantic Web Journal, especially the reviewers, is aware of Semantic Web technologies.
- Many claims would need a reference. For example in the first column of page 3.
- p4. lines 17-19. What do you mean by "Most of the current semantic models request excessive processing time and are not suitable for the IoT" ?!
- With CoAP it is not always true that "information come back and is displayed in HTML or RDF format"
- Copper (Cu) is just one example of a CoAP user-agent. The approach could use any other CoAP client.
- Similarly, Mosquitto is just one example of an MQTT broker implementation.
- The /.well-known/core entry point and the CoRE data format are specified in CoRE Resource Directory (IRTF T2T draft-ietf-core-resource-directory-23) and CoRE Link Format (RFC 6690), respectively. Not in CoAP (RFC 7252).
- The port of a CoAP server is not necessarily 5563.
- The IP of a CoAP server is not necessarily IPv6.
- It is not true that the ct attribute in a CoRE Link Format document should be used to store the type and value of an observation. ct is intended to provide a hint about the Internet media type a resource returns.
- In Fig 3, why is only MQTT shown, and why do the arrows point both ways?
- Only AVG data aggregation is considered? It is not clear why the authors use some sort of k-clustering (unspecified) on sensors before computing the average for these clusters. Algorithm 1 is neither correct nor useful.
- Most of the components in the JSON file of Fig. 4 are unexplained. As such, the figure is useless.
- "fog" is first used p.5 line 42. It is never explained.
- Algorithm 2 is neither correct nor useful. The loops are not ended. The variables are not explained. What are N and M?
- The quality of the ontology in Fig 8 is very questionable. How can Coverage be a sub-class of Polygon, Circle, and Rectangle at the same time? How can a Device be a sub-class of Actuator and Sensor at the same time?
- There is a more recent version of SSN than the one used. See
- What is the "number of tuples" on the x-axis of Fig 10 and 11 in the evaluation? What do "1 and 2 applications" mean?
- The different data formats (XML, CSV, JSON, SQL (?), TXT) are text-based and should not be considered as data formats for the IoT. There are data formats more suitable for the IoT such as EXI or CBOR.
- The claims that the model is better than the two others are questionable. For example, XGSN could very well be adapted to use IPv6, so the claim that Het-SW is more scalable because it uses IPv6 is irrelevant. L-SW is never explained.
- The main characteristics of XGSN and L-SW should at least be explained.
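
To make the remark about the ct attribute concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the example payload and resource paths are hypothetical, the parser is deliberately simplified (it assumes no commas or semicolons inside quoted values and a single ct value per link), and the table lists only a handful of registered CoAP Content-Format identifiers. The point is that ct maps to a media type hint, not to an observation type or value.

```python
# Small subset of registered CoAP Content-Format identifiers.
CONTENT_FORMATS = {
    0: "text/plain; charset=utf-8",
    40: "application/link-format",
    41: "application/xml",
    42: "application/octet-stream",
    47: "application/exi",
    50: "application/json",
    60: "application/cbor",
}

# Hypothetical payload, as a server might return from its link-format resource.
EXAMPLE = (
    '</sensors/temp>;rt="temperature-c";if="sensor";ct=50,'
    '</sensors/light>;rt="light-lux";if="sensor";ct=0'
)


def parse_link_format(payload):
    """Parse a simple CoRE Link Format payload into {target: {attr: value}}.

    Simplified for illustration: assumes no ',' or ';' inside quoted values.
    """
    links = {}
    for entry in payload.split(","):
        parts = entry.strip().split(";")
        target = parts[0].strip()
        if not (target.startswith("<") and target.endswith(">")):
            raise ValueError(f"malformed link target: {target!r}")
        attrs = {}
        for part in parts[1:]:
            name, _, value = part.partition("=")
            attrs[name.strip()] = value.strip().strip('"')
        links[target[1:-1]] = attrs
    return links


def media_type_hint(attrs):
    """Return the media type hinted at by ct, or None if ct is absent."""
    ct = attrs.get("ct")
    if ct is None:
        return None
    return CONTENT_FORMATS.get(int(ct), "unknown")


if __name__ == "__main__":
    for target, attrs in parse_link_format(EXAMPLE).items():
        print(target, "->", media_type_hint(attrs))
```

In this sketch the observation itself would still have to be fetched from the resource; the link description only tells a client which representation to expect.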

A simple lookup on Google Scholar with the keywords from the title returns (random selection):

Fortino, G., Savaglio, C., Palau, C. E., de Puga, J. S., Ganzha, M., Paprzycki, M., ... & Llop, M. (2018). Towards multi-layer interoperability of heterogeneous IoT platforms: The INTER-IoT approach. In Integration, interconnection, and interoperability of IoT systems (pp. 199-232). Springer, Cham.
Balakrishna, S., & Thirumaran, M. (2019). Towards an optimized semantic interoperability framework for IoT-based smart home applications. In Internet of Things and Big Data Analytics for Smart Generation (pp. 185-211). Springer, Cham.
Medvedev, A., Hassani, A., Zaslavsky, A., Jayaraman, P. P., Indrawan-Santiago, M., Haghighi, P. D., & Ling, S. (2016, November). Data ingestion and storage performance of IoT platforms: Study of OpenIoT. In International Workshop on Interoperability and Open-Source Solutions (pp. 141-157). Springer, Cham.
Čolaković, A., & Hadžialić, M. (2018). Internet of Things (IoT): A review of enabling technologies, challenges, and open research issues. Computer Networks, 144, 17-39.
Sharma, R., & Sharma, A. (2019, November). A Review on Interoperability and Integration in Smart Homes. In International Conference on Futuristic Trends in Networks and Computing Technologies (pp. 116-128). Springer, Singapore.
Datta, S. K., Bonnet, C., Baqa, H., Zhao, M., & Le-Gall, F. (2018, June). Approach for Semantic Interoperability Testing in Internet of Things. In 2018 Global Internet of Things Summit (GIoTS) (pp. 1-6). IEEE.
Ullah, F., Habib, M. A., Farhan, M., Khalid, S., Durrani, M. Y., & Jabbar, S. (2017). Semantic interoperability for big-data in heterogeneous IoT infrastructure for healthcare. Sustainable Cities and Society, 34, 90-96.
Rhayem, A., Mhiri, M. B. A., & Gargouri, F. (2020). Semantic Web Technologies for the Internet of Things: Systematic Literature Review. Internet of Things, 100206.
Al-Osta, M., Bali, A., & Gherbi, A. (2019). Event driven and semantic based approach for data processing on IoT gateway devices. Journal of Ambient Intelligence and Humanized Computing, 10(12), 4663-4678.
Haller, A., Janowicz, K., Cox, S. J., Lefrançois, M., Taylor, K., Le Phuoc, D., ... & Stadler, C. (2019). The modular SSN ontology: A joint W3C and OGC standard specifying the semantics of sensors, observations, sampling, and actuation. Semantic Web, 10(1), 9-32.

Review #2
Anonymous submitted on 08/Nov/2020
Major Revision
Review Comment:

L-SW is mentioned but there seems to be no reference to it.
Mosquitto is mentioned without any reference.
The LiO-IoT ontology resembles IoT-Lite quite significantly, with matching nomenclature for some concepts. Best practice in ontology development would be to re-use existing concepts and extend them with the new concepts that constitute the authors' ontology.
It is unclear how the proposed system supports mobile devices and the others don't. Please clarify. Below is a publication about how GSN supports mobile devices.
C. Perera, A. Zaslavsky, P. Christen, A. Salehi, D. Georgakopoulos, “Connecting Mobile Things to Global Sensor Network Middleware using System-generated Wrappers”
Regarding scalability, reliance on IPv6 alone does not bring any contribution here. The application of virtualization is not clear. What specifically is meant by virtualization here? If it refers to sensor virtualization, doesn't XGSN already support this?
How is reliability measured when the setup is simulated?
Examples such as those below would be a good contribution to the SOTA and a good comparison to the proposed system. The related work in this field is not covered well.
B. Cheng, G. Solmaz, F. Cirillo, E. Kovacs, K. Terasawa and A. Kitazawa, "FogFlow: Easy Programming of IoT Services Over Cloud and Edges for Smart Cities," in IEEE Internet of Things Journal, vol. 5, no. 2, pp. 696-707, April 2018, doi: 10.1109/JIOT.2017.2747214.
Lanza, J.; Sánchez, L.; Gómez, D.; Santana, J.R.; Sotres, P. A Semantic-Enabled Platform for Realizing an Interoperable Web of Things. Sensors 2019, 19, 869.

Review #3
Anonymous submitted on 09/Nov/2020
Major Revision
Review Comment:

This paper proposes a light-weight gateway architecture aiming to support heterogeneity in IoT data using semantic web technologies. Further, the authors propose a sensor virtualization technique within the gateway.

The paper is well structured and clearly written except for minor errors, yet the authors need to pay more attention to referencing, overall flow, and the relevance of the content. For example, some statements in the introduction are missing references. Further, I would propose making the related work section more focused and relevant. For example, the authors discuss SSN and IoT-Lite, yet do not compare or contrast them with their proposed methodology (or with the authors' ontology). Otherwise, if the ontology is not the main focus, it is better to emphasize the existing gateway architectures and Semantic Web of Things approaches and discuss the problem. With that, you could highlight the limitations of existing approaches and illustrate the gap more clearly. As this is a vastly researched area, I think it is very important to highlight the originality and novelty of the approach. The related work section should therefore have been planned to build a logical flow that brings the reader's attention to the problem.

I propose to refer to more literature to understand some of the existing work that you have missed mentioning. For example, Gyrard A. et al's work, Ruta M., et al's work such as "Enabling the Semantic Web of Things: framework and architecture", Kotis K., et al. paper on "Semantic Interoperability on the Web of Things: The Semantic Smart Gateway Framework" and some recent work such as Botonakis S., et al's iSWoT, Patel P. et al's SWoTSuite etc. These would have been more relevant literature on the topic.

The reference to the LiO-IoT ontology comes late in the paper, so I was initially confused when reading the explanation of the proposed gateway architecture. If the ontology is publicly available, please add a link to the resource via a footnote. The snapshot of the file provided to show the JSON to LiO-IoT mapping does not give a clear picture to a reader who doesn't know the ontology. Thus, consider either describing the mappings in detail or providing the mapping files with an access link.

I was interested in reading about the JSON to RDF mapping method that you have utilised. I assume it carries considerable weight in resolving the heterogeneity of IoT data in your proposed architecture. It would have been better if you had discussed the mapping of JSON into RDF further. A more generic mapping solution would indeed be great in terms of reducing the extensive processing of data. Potentially, if JSON data output from a sensor could be streamed straight into a knowledge graph according to the LiO-IoT schema, that would help to remove some layers from your architecture. If you are interested, Sergio J. et al's work, "J2RM", discusses a potential pathway for this type of mapping.
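
As a rough illustration of the kind of direct JSON to RDF mapping meant here, a minimal Python sketch follows. It is neither the authors' method nor J2RM. The JSON field names (id, sensor, value, time) and the example.org namespace are hypothetical, and SOSA terms are used only as stand-ins, since the manuscript does not spell out the corresponding LiO-IoT terms.

```python
import json

SOSA = "http://www.w3.org/ns/sosa/"
XSD = "http://www.w3.org/2001/XMLSchema#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/"  # hypothetical namespace for this sketch


def reading_to_ntriples(raw):
    """Map one JSON sensor reading to a list of N-Triples lines.

    Assumes hypothetical keys 'id', 'sensor', 'value', 'time', and that
    'id' and 'sensor' are already safe to embed in an IRI.
    """
    reading = json.loads(raw)
    obs = f"<{BASE}observation/{reading['id']}>"
    sensor = f"<{BASE}sensor/{reading['sensor']}>"
    value = float(reading["value"])
    time = reading["time"]
    return [
        f"{obs} <{RDF_TYPE}> <{SOSA}Observation> .",
        f"{obs} <{SOSA}madeBySensor> {sensor} .",
        f'{obs} <{SOSA}hasSimpleResult> "{value}"^^<{XSD}double> .',
        f'{obs} <{SOSA}resultTime> "{time}"^^<{XSD}dateTime> .',
    ]


if __name__ == "__main__":
    sample = '{"id": "o1", "sensor": "t7", "value": 21.5, "time": "2020-11-09T10:00:00Z"}'
    for line in reading_to_ntriples(sample):
        print(line)
```

A mapping of this shape runs once per incoming reading, which is why a declarative, generic mapping could plausibly replace some of the intermediate structuring steps.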

The authors seem to have put a lot of effort into integrating different protocols such as CoAP and MQTT to receive data from various sensors and actuators. This is a good effort. However, in my opinion, you should have put more effort into generating heterogeneous data (which of course you have already done, since you have considered data from mobile devices and multiple sensors). Otherwise, you are putting more emphasis on the engineering tasks instead of putting weight on generalizing the framework or emphasizing the main contribution. Further, I am not entirely sure why you would have an additional virtual sensor layer in the gateway. I read that this is one core component of the paper. So my suggestion is to expand on it a bit further.
Wouldn't it be better to feed the data streams (from a sensor/actuator) directly into the formation and structuring layer and then generate the required RDF triples (with JSON -> RDF)? You could aggregate and annotate the data as required in the formation layer. This is a thought that came to my mind as I went through the proposed architecture. So I think putting more emphasis on this point would help to highlight the novelty of your approach.

I think this is an interesting area which needs more research. As the authors mentioned, there is a growing trend in this area with the proliferation of smart devices. However, in this paper, I see little emphasis on the word lightweight. How is it lightweight? You have provided comparisons with existing frameworks, yet they do not highlight how your virtualization layer makes the difference. Further, you alternate between "tuples" and "data packets" (page 12, line 38). Here I was again a bit confused. What is the relationship between a tuple and a data packet?

In the conclusion, you have claimed three contributions. I agree that these are all required improvements. However, my understanding is that the paper needs a bit more reshaping to highlight the main focus. Otherwise, discuss how each of the contributions mentioned in the conclusion section is addressed by your proposed architecture, rather than focusing on the engineering aspects of the system. That is: how LiO-IoT makes the architecture light compared to the existing literature (weight the discussion more towards how the LiO-IoT ontology helps to provide a quick response to the user's query, as you have stated in the conclusion), how sensor virtualization makes it scalable, and what pros and cons this layer brings in terms of performance and other required parameters.

Minor comments

You have used the terms "lightweight ontology" and "LiO-IoT ontology" interchangeably. I assume you mean the same thing. To be consistent, I propose sticking to one term throughout the paper.

Page 1 line 50: I think it should be *shared vocabularies instead of "share vocabularies" and *important role, instead of "important rule".

You haven't defined the abbreviation SWoT before using it. In line 43, you can define it before first use.

Page 8 line 32: I think this should be RDF triple, not triplet.

Page 12 line 35: "From Figure 11 (10 and ??)" Not clear what this means.