Review Comment:
The paper presents an ontology for representing folktales based on Vladimir Propp's theory and an application of the ontology to the description of traditional tales from Sub-Saharan Africa and the Indian state of Kerala. The authors used the ontology to build an information extraction system for Propp functions in folktales. Then, they populated the ontology and performed an analysis of the resulting knowledge base, drawing some interesting conclusions about the narrative structure of such tales.
The authors' study is relevant to the current special issue. I found the results interesting, and I appreciated the authors' focus on non-Western tales since these kinds of texts are often overlooked in Western-centric Digital Humanities.
However, some major issues need to be resolved. In my opinion, the authors should provide a better introduction to the problem they aim to solve, and more detailed justifications of their methodology and modelling choices. The related works section should be expanded. Some sections of the paper are unclear and, in some cases, disconnected from one another. The implementation section contains many technical details that are not important for the paper. Finally, the conclusions could be restructured to give a better overview of the work done by the authors.
A significant issue that hindered my evaluation of the paper is the fact that the website of the ontology is currently offline, making it impossible to access the ontology and the web application. I kindly ask the authors to restore the website.
In the following, I report the issues I identified and suggest some edits and corrections that may improve the paper.
== Introduction ==
1. The name of the ontology is not stated anywhere. The first mention of "ProppOntology" is in the Related Works section on page 4. The introduction refers to "this project", but the name of the project is not stated anywhere.
2. The introduction lacks a detailed problem statement. I suggest that the authors move to the introduction the first two paragraphs of section 2.3, which explain more clearly what the problem is, why they are working on it and how they plan to solve it.
3. It would also be helpful to specify in the introduction what language the ontology is written in, e.g. RDFS or OWL, providing an appropriate reference and explaining the acronym (e.g. "Web Ontology Language (OWL) [99]").
4. How should the expressive power of an ontology "grow with the number of annotations"? The expressive power of an ontology generally depends on the classes, properties, and axioms that it contains, not on the number of individuals that populate it.
== Description of the Domain ==
1. At the end of section 2.2, the discussion of the tale is long and specific. If the authors want to use this tale as an example, it would be better to provide at least a short summary of the storyline to help the reader follow the discussion.
2. In the last sentence, the authors state "nor is there only one correct sequence of functions per tale". I agree with this view, but in my opinion, the example presented in section 2.2 is not enough to prove this statement true. It would be helpful if the authors added a reference to a scholarly work discussing this issue.
3. At the end of page 3, the authors discuss implementation details such as vocabularies and annotation properties. This seems out of place and should be moved to section 5. Furthermore, the references for RDFS and OWL Annotation Properties should be provided.
== Related Works ==
1. This section could be extended and improved by describing other ontology-based approaches to narrative representation.
2. The authors state that in their ontology the class Move is a subclass of Tale "because a tale consists of one or multiple moves" but in general a part-of relation is very different from a subclass relation. If Move is a subclass of Tale, then it can be inferred that each Move is a Tale. Is this what the authors intend?
3. Based on the provided references, it appears that the ProppOnto ontology was authored by Peinado [13], not by Declerck [15] as stated by the authors (also in section 5.4). In fact, reference [15] does not mention ProppOnto at all. Could the authors check? Furthermore, the authors should make it clear that ProppOnto and ProppOntology are two different ontologies because given the similarity of the names it is very easy to get confused.
4. The sentence "the Internationalized Resource Identifier (IRI) is a short description of the function according to the corresponding literal" is not clear to me because: (i) an IRI is simply an identifier, not a description, and (ii) "corresponding literal" is very vague: which literal? corresponding to what?
6. The authors should justify why they chose to import the Family Ontology by Koleva [18] instead of other, more common genealogical ontologies.
7. When discussing the work by Koleva [18], the authors state that Koleva "used SWRL rules for the classes" but, generally, SWRL rules are applied on variables that represent individuals. Could the authors better explain this point? Furthermore, the authors state that Koleva's approach works "comparatively well", but what is the term of comparison?
8. When introducing the term "verbalisation", the authors should provide a definition of it.
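To illustrate point 2 above, here is a minimal sketch in plain Python (no RDF library; the individuals ex:tale1 and ex:move1 and the property ex:hasMove are hypothetical and not taken from the authors' ontology). It applies the RDFS subclass entailment rule to two small sets of triples. Under the subclass modelling every move is inferred to be a tale, whereas under a part-of modelling it is not.

```python
# Triples are (subject, predicate, object) tuples; all ex: names are hypothetical.
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

def infer_types(triples):
    """Propagate rdf:type along rdfs:subClassOf (RDFS rule rdfs9) until a fixpoint."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_pairs = [(s, o) for s, p, o in inferred if p == SUBCLASS]
        for s, p, o in list(inferred):
            if p != TYPE:
                continue
            for sub, sup in subclass_pairs:
                if o == sub and (s, TYPE, sup) not in inferred:
                    inferred.add((s, TYPE, sup))
                    changed = True
    return inferred

# Modelling A: Move is declared a subclass of Tale.
subclass_model = {
    ("ex:Move", SUBCLASS, "ex:Tale"),
    ("ex:move1", TYPE, "ex:Move"),
}

# Modelling B: a tale is linked to its moves by a part-of style property.
part_of_model = {
    ("ex:tale1", TYPE, "ex:Tale"),
    ("ex:move1", TYPE, "ex:Move"),
    ("ex:tale1", "ex:hasMove", "ex:move1"),
}

move_is_tale_subclass = ("ex:move1", TYPE, "ex:Tale") in infer_types(subclass_model)
move_is_tale_part_of = ("ex:move1", TYPE, "ex:Tale") in infer_types(part_of_model)
print(move_is_tale_subclass)  # True
print(move_is_tale_part_of)   # False
```

If the authors do not intend every move to be classified as a tale, an object property along the lines of the hypothetical ex:hasMove would express the intended meaning without this entailment.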
== Design ==
1. I suggest that the authors better describe and motivate their adoption of an ontology-based approach. At present they do not mention the main advantages of using ontologies, e.g. standardization and interoperability.
2. When the authors mention "RDF model", do they mean "OWL model"? Indeed, RDF does not have a concept of class (first introduced in RDFS), nor a distinction between object property and data property (first introduced in OWL).
3. The idea of representing each Proppian function as both a class and a property seems strange to me. Generally, a class and a property are two different things and a resource cannot be both a class and a property. In my opinion, to motivate and clarify their approach, the authors should answer the following questions: (i) are the "function class" and the "function property" two different resources with different IRIs? (ii) are they connected to each other, and if so how? (iii) why not simply connect each character to the function it appears in, instead of defining "function properties"? (iv) why not use reification to achieve the same goal?
4. The authors state that "in real life the classes of humans and animals would certainly be distinct". However, according to biology, Human is a subclass of Animal, therefore they cannot be distinct.
5. In Figure 1, the names of object properties begin with an uppercase character. This is in contrast with standard practice in the Semantic Web field, which is to use a lowercase character.
6. The description of object properties and data properties is very short and it should be expanded, providing a list of the main properties of the ontologies if possible.
7. At the end of page 8, the sentence "the function Reconnaissance ϵ1 applies" is unclear (at least without looking at the ontology).
8. Also at the end of page 8, the sentence "since they have been included before the Family Ontology was imported" is not clear to me: is there a reason for not doing it now?
9. On page 9, when discussing the work of the annotators, it would be helpful to state who these annotators are (are they experts in the field or not?). Furthermore, was each tale annotated by only one annotator, or by more than one?
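Regarding point 3 above, the alternative I have in mind can be sketched as follows. This is only an illustration in plain Python under my own assumptions: the individuals and the properties ex:inTale and ex:hasParticipant are hypothetical and do not come from the authors' ontology. Each occurrence of a Proppian function is an individual typed with the function class and linked to its characters through one generic property, so no separate "function property" per function is required.

```python
TYPE = "rdf:type"

# Each function occurrence is an individual typed with its function class
# and linked to its tale and characters through generic properties.
# All ex: individuals and properties are hypothetical.
triples = {
    ("ex:villainy1", TYPE, "ex:Villainy"),
    ("ex:villainy1", "ex:inTale", "ex:tale1"),
    ("ex:villainy1", "ex:hasParticipant", "ex:hyena"),
    ("ex:villainy1", "ex:hasParticipant", "ex:hare"),
    ("ex:departure1", TYPE, "ex:Departure"),
    ("ex:departure1", "ex:inTale", "ex:tale1"),
    ("ex:departure1", "ex:hasParticipant", "ex:hare"),
}

def functions_of(character, triples):
    """Return the sorted function classes of all occurrences a character takes part in."""
    occurrences = {
        s for s, p, o in triples
        if p == "ex:hasParticipant" and o == character
    }
    return sorted(o for s, p, o in triples if p == TYPE and s in occurrences)

hare_functions = functions_of("ex:hare", triples)
print(hare_functions)  # ['ex:Departure', 'ex:Villainy']
```

With this pattern, the question "in which functions does a character appear?" is answered by a single generic property, and the function classes remain ordinary classes.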
== Implementation ==
1. In general, the implementation section contains technical details that can be considered not relevant in this context, e.g. about the security of the platform, which can be removed.
2. "most modern web applications are developed using programming languages like PHP or Ruby" -> please provide a reference.
3. The authors should provide a reference, or at least a footnote, when introducing the Apache Jena Fuseki software.
4. "the Webprotégé instance is not directly connected..." -> how is the Webprotégé ontology synchronized with the ontology stored in Fuseki?
5. In Section 6.1, the authors state that there are three means to query the ontology, but they describe only two. What is the third?
6. "would be represented by the placeholders" -> "would be represented by variables"
7. What do the authors mean by "most recent relations"?
== Information Extraction from Tale Texts ==
1. "They trained" -> they who? The authors of the module?
2. The last two sentences of section 7.1 are generic. The quality of the paper could be improved if the authors better quantify the advantages and disadvantages of their approach.
== Conclusions ==
1. I suggest restructuring the section in order to provide a better description of what has been done, e.g. starting with "In this paper we have presented ProppOntology, an ontology for modelling folktales based on Vladimir Propp's theory Morphology of the Folktale" and then briefly describing the steps followed by the authors and the main results that they achieved.
== Spelling & Grammar ==
I suggest that the authors thoroughly check the spelling and grammar of the paper, including the following:
• page 1, column 1, line 40: in the beginning -> at the beginning
• page 1, column 2, line 47: help assessing -> help assess
• page 2, column 1, line 27: ist structured -> is structured
• page 2, column 1, line 27: subsequent -> following
• page 2, column 1, line 38: conclusion -> conclusions
• page 2, column 2, line 8: add comma after "furthermore"
• page 2, column 2, line 37: intial -> initial
• page 2, column 2, line 40: figur -> figure
• page 3, column 2, line 8: remove comma before "fights"
• page 3, column 2, line 40: neccessary -> necessary
• page 3, column 2, line 41: opinion which -> opinion on which
• page 3, column 2, line 41: analyses analyses -> analyses
• page 3, column 2, line 43: vocabulary -> vocabularies
• page 3, column 2, line 43: like -> such as
• page 3, column 2, line 45: like -> such as
• page 4, column 2, line 11: different -> differently
• page 4, column 2, line 29: rdf:comments -> rdfs:comments
• page 4, column 2, line 37: Types -> types
• page 5, column 2, line 16: prevalant -> prevalent
• page 5, column 2, line 21: like -> such as
• page 6, column 1, line 19: extration -> extraction
• page 6, column 2, line 1: with folktales -> of folktales
• page 6, column 2, line 11: Following -> Following the
• page 6, column 2, line 36: this order -> and this order
• page 8, column 2, line 46: invididuals -> individuals
• page 9, column 1, line 41: respectively -> and respectively
• page 10, column 2, line 6: productive -> production
• page 10, column 2, line 40: fusekis sparql -> Fuseki's SPARQL
• page 11, column 1, line 39: checkbox behind -> checkbox beside
• page 13, column 1, line 6: gramatically -> grammatically
• page 13, column 1, line 15: Therfore -> Therefore
• page 13, column 1, line 35: approach method -> method
• page 13, column 2, line 12: occurences -> occurrences
• page 14, column 1, line 20: add comma after "Instead"
• page 14, column 1, line 21: webinterface -> web interface
• page 14, column 1, line 23: front end -> frontend
• page 17, column 1, line 49: audiencce -> audience
• page 17, column 2, line 3: javascript -> JavaScript
• page 17, column 2, line 24: humanities -> Humanities
• page 17, column 2, line 40: webprotégé -> Webprotégé