Challenge-derived design principles for a semantic gazetteer for medieval and early modern places

Tracking #: 2066-3279

Authors: 
Philipp Schneider
Jim Jones
Torsten Hiltmann
Tomi Kauppinen

Responsible editor: 
Christoph Schlieder

Submission type: 
Survey Article
Abstract: 
In recent years, gazetteers based on semantic web technologies have been discussed as an effective way to describe, formalize and standardize place data by using contextual information to structure places and distinguish them from each other. While research on semantic gazetteers for historical places has pointed out the importance of enabling the creation of a global and epoch-spanning gazetteer, we want to emphasize the importance of taking a domain-oriented approach as well, in our case focusing on places set in medieval and early modern times. By discussing the topic from the historians’ perspective, we identify a number of challenges that are specific to the semantic representation of places set in these time periods. We then survey existing gazetteer projects that take historical places into account. This enables us to find out which existing technologies and practices can meet the demands of a gazetteer that considers the time-specific geographic, social and administrative structures of medieval and early modern times. Finally, we develop a catalogue of design practices for such a semantic gazetteer. Our recommendations are derived from these existing solutions as well as from the epoch-specific challenges identified before.
Tags: 
Reviewed

Decision/Status: 
Minor Revision

Solicited Reviews:
Review #1
By Muriel van Ruymbeke submitted on 08/Mar/2019
Suggestion:
Minor Revision
Review Comment:

This manuscript was submitted as 'Survey Article' and should be reviewed along the following dimensions: (1) Suitability as introductory text, targeted at researchers, PhD students, or practitioners, to get started on the covered topic. (2) How comprehensive and how balanced is the presentation and coverage. (3) Readability and clarity of the presentation. (4) Importance of the covered material to the broader Semantic Web community.

The paper is suitable as introductory text, targeted at researchers, PhD students, or practitioners, to get started on the covered topic. But some points still need to be tackled (see below). In general, the presentation and coverage are comprehensive and well balanced, with some points to improve:
a. Section 2: basic concepts to structure historical place and spatial data. The basic concepts should be emphasized more.
b. Conclusion: the last sentence says: “…we have developed a catalog of design practices…”. The design practices should be highlighted more. I think Reviewer 1's main request has not yet been completely addressed.
c. Bibliographical references: there is no publication year?
d. Typo problem at the beginning of section 4.2.

The presentation is readable and clear, with 2 exceptions:
e. Fig. 4 is unreadable.
f. A UML schema is needed in section 4.2.

The covered material could be of interest to the broader Semantic Web community. Indeed, the paper describes the problems encountered in dealing with uncertain, incomplete, fuzzy and contradictory information. Solving the described difficulties could help to enrich the semantic aspect of data in the Semantic Web in general.

Specific questions or requests to Author:
1) Introduction, part 3: the clarification given to Reviewer 1 regarding the technical level, and the use of 3 dimensions to distinguish places from one another, introduces a new confusion: it is difficult to understand the difference between the “meaning” and the construction of the same meaning by “human experience”.

2) Introduction, part 5: “For historians of the pre-modern eras, it is less important where the places describes in historical sources are geographically located”. And later, in part 6: “19th, 20th and 21th centuries – epochs which seem better suited for cartographic depiction of the world”. These sentences should either be explained and supported by solid arguments, or removed.

3) Sections 3 and 4: the work could perhaps be enriched by reading: Niccolucci, F., & Hermon, S. 2016. Representing gazetteers and period thesauri in four-dimensional space–time. International Journal on Digital Libraries 17(1): p.63–69.

4) Section 4, part 2: the implemented CIDOC CRM is also available on the CIDOC CRM website (http://new.cidoc-crm.org/versions-of-the-cidoc-crm), where entities and properties have their original URIs.

5) Section 5.3, last sentence: “… a new model has yet to be developed…”. My question: from scratch, or as an extension of an existing model? In the latter case, which one?

6) Section 5.4: Why not consider OWL-Time?

Review #2
By Gerald Hiebel submitted on 09/Apr/2019
Suggestion:
Accept
Review Comment:

This manuscript was submitted as 'Survey Article' and should be reviewed along the following dimensions: (1) Suitability as introductory text, targeted at researchers, PhD students, or practitioners, to get started on the covered topic. (2) How comprehensive and how balanced is the presentation and coverage. (3) Readability and clarity of the presentation. (4) Importance of the covered material to the broader Semantic Web community.

The paper gives a comprehensive description of the challenges of building a historical gazetteer and covers the available approaches. Thus it is suitable as a “Survey article”. In addition, the topic is clearly presented.
Building historical gazetteers is a typical use case for Semantic Web methodologies, as other approaches are not able to cover the complexity of the phenomenon. Moreover, places are of high importance to any Linked Data application, and historical places are essential for the Cultural Heritage and Digital Humanities communities.
Most of the reviewers' comments on the first version were well addressed, and it was important to make these changes.
There are two comments that, in my opinion, were not sufficiently addressed:
Page 5: 3.3. Hierarchies
I agree with Martin Doerr that a representation of place relations through hierarchies should at least be put alongside a representation as a network to cover the phenomenon in its complexity. We have place relations on a semantic and on a geometric level that may have many different topological relations (CRMgeo adds to CIDOC CRM all the topological relations available in GeoSPARQL through the linking of the two ontologies). The identity of places is crucial for their possible relations. Whether they are, in the CRMgeo sense, phenomenal (derived from events or physical objects) or declarative determines the types of relations that may exist between them. It should at least be mentioned that a network representation could solve some of the conflicts that arise from a hierarchical representation.

Pages 10-11: 4.1. CIDOC CRM
When describing the SP2 Phenomenal Place, it is worth mentioning that an SP2 can derive its identity either from a temporal entity (E4 Period or E5 Event) or from a physical object (E18 Physical Object). These identity criteria, and the representation of the respective E4 or E18 in the gazetteer, would help a lot with the disambiguation necessary in a historical gazetteer and with the topological relations (temporal and spatial) between entities.
As E4 Period and E18 Physical Object are subclasses of E92 Spacetime Volume, they inherit the property “P161 has spatial projection”. This representation would be preferable to the “P7 took place at” representation, as it states the identity of the place, and thus E4 Periods can be related through their respective properties, which can be causal ones as well.
In Figure 1, E18 Physical Object should also be included as a subclass of E92 Spacetime Volume, in order to represent these dual identity criteria for places.

Review #3
By Carsten Keßler submitted on 10/Apr/2019
Suggestion:
Minor Revision
Review Comment:

This revision of the article is substantially improved compared to the original submission. It serves much better as a survey now, particularly as an introductory text on historical gazetteers. Having said that, I think some improvements could still be made to improve the article as a survey:

- As indicated in my first review, I would still like to see an argument about what exactly is unique about medieval and early modern places; put the other way around, it would be useful to see which design practices can also be applied to historical gazetteers focusing on other periods.
- At the end of section 2, a mention of qualitative spatial reasoning plus some references could be useful.
- (In my first review, I had also suggested adding some references, none of which has been added, even though the response letter by the authors claims that one of them has been. I think these should still be added.)
- On p.9, right column, the paper says: "Of course this list is far from complete." For a survey, this is somewhat problematic; I think the gazetteers, ontologies etc. that are not discussed in detail should at least be mentioned with references.
- Subsection 4.6 should be extended with an overview table that shows the challenges listed in section 3 and briefly states if and how the different gazetteers and ontologies discussed in section 4 address those challenges. This would add a lot of value for readers of the survey.

There are also a couple of typos that still need to be fixed:

- "extent" is spelled "extend" several times
- p.2, right column:
  - 21th ⇢ 21st
  - adapt a domain-oriented focus ⇢ adopt
- p.3, left column:
  - adapted for this specific domains ⇢ these
  - we are able to get an overview over ⇢ overview of
- p.3, right column:
  - the place’s position on earth ⇢ on Earth
  - Suburban areas outside the walls, can also... ⇢ remove comma
- p.5, right column:
  - To better grasp this challenges about multiple assertions ⇢ "this challenge" or "these challenges"
- p.6, right column:
  - In the Middle Ages, ‘ruling’ did it not mean ruling over space ⇢ did not mean
- p.8, right column:
  - The CIDOC CRM [34] is a heavy weighted ontology, conceived for managing items collected by museums and tracking its provenance ⇢ The CIDOC CRM [34] is a heavy-weight ontology, conceived for managing items collected by museums and tracking their provenance
- p.10, left column:
  - Firstly, the use of the classE4 Period. ⇢ class E4 Period
- p.13, right column:
  - Because the project’s definition is based on Yi-Fu Tuan’s experienced based approach to places ⇢ experience-based approach
- p.14, right column:
  - This also prevents redundancies because the concepts of an epoch has to be defined only once. ⇢ the concept
  - Last paragraph: No need to introduce the GOV abbreviation again
- p.15, left column:
  - fiat concepts represent men-made virtual units ⇢ man-made
- p.18, left column:
  - In most of the cases historical text report events ⇢ historical texts