Review Comment:
This is a joint review by Kai Eckert and Benjamin Schnabel. Benjamin Schnabel is a PhD student in the field of Digital Humanities (Jewish Studies).
In this paper, the author describes the “Sampo Model”, an informal collection of principles for LOD publishing. It is an extended version of a DHN 2020 conference (poster) paper. The extension is indeed substantial and therefore a valid contribution to the SWJ.
The author lists six principles for LOD publishing: 1. Support collaborative data creation and publishing, 2. Use a shared open ontology infrastructure, 3. Support data analysis and knowledge discovery in addition to data exploration, 4. Provide multiple perspectives to the same data, 5. Standardize portal usage by a simple filter-analyze two-step cycle, 6. Make clear distinction between the LOD service and the user interface (UI).
As emphasized in the discussion, none of these principles are new, but it is still valuable to describe them together in context. For each principle, the author provides a short explanation and examples from the various LOD portals that use the Sampo model.
In a sense, the paper is a retrospective on the ongoing work of the past 19 years, during which many LOD portals (mainly in Finland) have been developed, also with a focus on interoperability.
The principles are meant to be seen as an extension to well-known ideas and requirements such as the four LOD principles and 5-star linked data. The paper is not so much about why we need LOD as about how it should be done. Nevertheless, there are more recent developments where LOD provides immediate advantages, such as for further analysis of the data. There is a need for analytic tools and automatic knowledge discovery.
This paper does not go into further technical details. While it is meant as an overview paper, some more details on the technical setup and the commonalities and differences between the different Sampo instances would be very interesting.
While it is acknowledged that the principles themselves are not new, it should also be stated that all of the principles are certainly also implemented in many LOD projects apart from the Sampo portals.
Regarding the first principle, the author remains very vague about how the data should actually be created. It is assumed that a knowledge graph is already available to be published. While such a scope limitation of course makes sense, the principle does not help with publication beyond the well-known LOD principles. Similarly, the second principle does not really provide new insights beyond data and ontology reuse and interlinking. We would argue that both principles are simply the foundation for the following principles, which actually focus on the publishing aspect beyond plain data publication and a SPARQL endpoint.
The third principle acknowledges a change in user requirements towards data portals. Whereas simple search and access to the data used to be sufficient, today the data should be easy to analyze and to combine with further data sources. Unfortunately, very little information is provided here as to how this can be achieved.
The fourth and fifth principles emphasize the importance of curating data for the user instead of merely putting it online, and of providing simple, standardized tools so that users can explore both prepared questions and their own questions by themselves.
Finally, the sixth principle is about frontend-backend separation. Here, the author only mentions SPARQL as a backend API. SPARQL alone is not sufficient in all cases, and arguably not even in many. While SPARQL is rightfully seen as the lingua franca of the Semantic Web, many application developers prefer dedicated APIs and additional services such as a search index (Elastic, Solr, etc.) to speed up application development and the resulting applications.
Overall, the paper is well written and easy to understand. The scope is a mixture of a research paper about the identified principles and a retrospective on close to 20 years of LOD portal development. As an application report, the paper is certainly acceptable and provides interesting and actionable insights for similar projects.
On the other hand, with this mixture, the paper falls somewhat short both as a research paper and as a retrospective. It would be interesting to get more insight into how these principles evolved over time and what other lessons have been learned. The presentation of the principles should address, earlier and more clearly, their relation to the existing principles (4 principles, 5 stars, etc.) and how they deal with the shortcomings of those principles, i.e., the questions that arise once the LOD is in place. We are certain the paper would benefit from rather minor adjustments and clarifications regarding the above-mentioned issues.
Minor issues:
“shared publishing infra”: “infrastructure” is probably meant.
Footnote to dumb down principle should point to https://www.dublincore.org/resources/glossary/dumb-down_principle/.
p. 4: ahead to is based on, superfluous “to”.
p. 4: “mdel” instead of “model”.
perspectiveS in principle P4.
p. 1: “WarWictim” should be “WarVictim”.
There are many more typos; please ensure proper proofreading.