Building Geoscience Semantic Web Applications Using Established Ontologies

Tracking #: 1187-2399

Authors: 
Matthew Mayernik
M. Benjamin Gross
Jon Corson-Rikert
Mike Daniels
Erica Johns
Huda Khan
Keith Maull
Linda R. Rowan
Don Stott

Responsible editor: 
Boyan Brodaric

Submission type: 
Full Paper
Abstract: 
Interplays between local ontology development and the establishment of wider ontology connections are fundamental to the Semantic Web. This paper discusses the goals and work of the EarthCollab project, focusing on ontology selection, consolidation, and reuse driven by geoscience use cases. The EarthCollab project is a collaboration between UCAR, Cornell University, and UNAVCO to leverage semantic technologies to manage and link geoscientific information and resources. EarthCollab is using the VIVO Semantic Web software suite to support the discovery of information, data, and potential collaborators within the geodesy and polar science communities. This paper presents the EarthCollab ontology design approach, which heavily emphasizes ontology reuse, and discusses how the different needs of each use case have informed our ontology selection and design. The EarthCollab project is bringing together the VIVO-Integrated Semantic Framework (VIVO-ISF) ontology, the Global Change Information System (GCIS) ontology, and the Data Catalog (DCAT) ontology, among others, to support diverse use cases related to the discovery of geoscience information and resources. Understanding the challenges and solutions for ontology reuse is critical to informing key decision points for new semantic web applications in deciding when to reuse existing ontologies and when to develop original ontologies.
Tags: 
Reviewed

Decision/Status: 
Reject

Solicited Reviews:
Review #1
By Xiaogang Ma submitted on 04/Nov/2015
Suggestion:
Minor Revision
Review Comment:

The manuscript is recommended for publication after moderate revision. For comments and suggestions see below.

The manuscript presents an ontology design approach in EarthCollab, a collaborative project aiming to leverage semantic technologies to manage and link various geoscience information and resources, such as datasets, publications, people, organizations, field stations, platforms, instruments, events, and more. The presented approach covers ontology selection and reuse, use-case-driven ontology engineering, and the adaptation of a platform (i.e. VIVO) for implementing Semantic Web applications. The topic is of significant value to the earth science informatics community, as well as to the computer science community. The manuscript provides detailed background information and examples drawn from first-hand experience, which make it enjoyable to read. I only have a few suggestions on the Discussion section and the organization of sections 3, 4 and 5.

At the end of the Introduction section, it is stated that the manuscript addresses two main questions:
- What are the key decision points for new Semantic Web applications in deciding when to reuse existing ontologies and when to develop original ontologies?
- How can new Semantic Web projects most efficiently and effectively identify and select ontologies to reuse?
The literature review in the Background section summarizes guidelines, methods and recommendations for ontology reuse or development, and the Geoscience Semantic Web Applications section lists several projects that share common topics with this manuscript. However, the Discussion section does not provide clear answers to the two questions, though a few points are discussed there, such as data interoperability facilitated by ontology reuse, combined paths for ontology design and application development, and ontology maintenance and versioning. I suggest that the authors extend the Discussion section by incorporating information from the Background section and experience from the current sections 4-7, and provide summarized approaches for the two main questions. For example, could you give a list of key decision points for question 1 and a suggested workflow for question 2? Those would be beneficial ‘take-home messages’ for readers who work on similar applications.

Sections 4, 5 and 6 offer a comprehensive introduction to the environment and components that support the work presented in this manuscript. It is acceptable to organize those components as a sequential list of sections, but I suggest that the authors combine Sections 4, 5 and 6 into one section and introduce the smaller topics in sub-sections. In particular, could you use a diagram to show the environment and components in your work and the relationships among them? Perhaps you could even extend such a diagram into a workflow. In either format this could help to address question 2 listed earlier (How can new Semantic Web projects most efficiently and effectively identify and select ontologies to reuse?).

Comments on specific parts of the manuscript:
In the middle of Section 7.1 it is stated that ‘Domain-specific ontology portals, such as http://ontobee.org or http://bioportal.bioontology.org/, are of limited utility for projects outside of those domains.’ You may add a reference to the ESIP Earth Science Ontology Portal (http://semanticportal.esipfed.org/) and offer some comments on how to promote that portal for both ontology curation and use.

Above Fig. 5, it is stated that ‘these context classes allow the designation of author order on a particular document.’ Could you offer more details on how this is realized? Also, the sentence ‘In order to apply the Structure of the paper’ at the end of that paragraph is incomplete.
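
One way the author-order mechanism could be realized, sketched here as an assumption rather than as the manuscript's actual model, is a context node that sits between a person and a document and carries a rank. The names below (`Authorship`, `rank`, `ordered_authors`) and all `ex:` identifiers are hypothetical plain-Python stand-ins, not terms confirmed by the paper.

```python
# Hypothetical sketch: a context ("authorship") node links one person to one
# document and carries a rank, so the order is stored on the link rather than
# on the person or the document.
from dataclasses import dataclass


@dataclass(frozen=True)
class Authorship:
    person: str    # URI-like identifier of the author (illustrative)
    document: str  # URI-like identifier of the document (illustrative)
    rank: int      # 1 = first author


def ordered_authors(authorships, document):
    """Return the authors of `document`, sorted by the rank on each context node."""
    links = [a for a in authorships if a.document == document]
    return [a.person for a in sorted(links, key=lambda a: a.rank)]


authorships = [
    Authorship("ex:personB", "ex:paper1", 2),
    Authorship("ex:personA", "ex:paper1", 1),
    Authorship("ex:personB", "ex:paper2", 1),
]

print(ordered_authors(authorships, "ex:paper1"))
```

Because the rank lives on the context node, the same person can be second author on one document and first author on another without any conflict.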

The first point discussed in the Discussion section is data interoperability. Could you write a little more about the data interoperability between EarthCollab VIVO and other VIVO applications?

Review #2
By Maria Poveda submitted on 16/Nov/2015
Suggestion:
Reject
Review Comment:

This paper describes the approach followed to build the EarthCollab ontology, focusing on the reuse activity. In general, the work presented is interesting in terms of ontological development applied to a given use case.

My main concern about the paper is whether it fits the "full paper" track or should instead be reshaped for the "ontology description" track. If the proposed approach for reusing ontologies is considered the paper's contribution, that approach should be described in terms of methodological guidelines or a method description, and it would be good to evaluate it through several use cases in different domains. It should then also be compared with the state of the art in ontology reuse. If instead the resulting ontology, which is only partially described in the paper, is considered the contribution, then IMO the paper would fit better in the "ontology description" track.

Assuming that the paper is submitted to the ontology description track, I would propose the following modifications, in addition to following the guidelines for ontology descriptions provided by the journal (http://semantic-web-journal.net/authors#types):

1.- First of all, the ontology should be available online. If it is not public, the reviewers should be able to access it under some arrangement. This would also apply to the full paper submission; I would have liked to see the ontology and check it.

2.- Provide an overview of the methodology used to develop the ontology, not only for the reuse activity, but the entire development, including the ontology evaluation.

3.- Regarding the related work on ontology reuse, I would suggest including references to the following approaches, or taking them into account during the development:
.-- Reuse of domain ontologies in: Suárez-Figueroa, M. C. (2010). NeOn Methodology for building ontology networks: specification, scheduling and reuse (Doctoral dissertation, Informatica).
.-- Schaible, J., Gottron, T., & Scherp, A. (2014). Survey on common strategies of vocabulary reuse in linked open data modeling. In The Semantic Web: Trends and Challenges (pp. 457-472). Springer International Publishing.
.-- Reuse of Ontology Design Patterns in: Presutti, V., Blomqvist, E., Daga, E., and Gangemi, A. (2012). Pattern-Based Ontology Design. In Suárez-Figueroa, M. d. C., Gómez-Pérez, A., Motta, E., and Gangemi, A., editors, Ontology Engineering in a Networked World, pages 35–64. --> ODP reuse might be useful for addressing the "event" modelling issue described in 7.2.

4.- In the introduction, the sentence "The principle of the “open world” is another key component of the Semantic Web, stipulating that new statements about Semantic Web resources are always possible, and that there should be no assumption that a URI uniquely refers to an individual resource [14]." is confusing. I would suggest rephrasing it to separate the Open World Assumption from the non-unique naming assumption. In fact, the last part should be revised to make clear that in the Semantic Web one should assume that a URI uniquely refers to one resource; what one cannot assume is that there is only one URI identifying a given resource. That is, two or more URIs can refer to the same resource.

5.- The notation used in the figures should be clarified. While it is clear in figures 3, 4 and 5, the symbols and representation used in figures 6 and 7 are confusing. Do hexagons represent instances? If so, does that mean that in Fig 6 the instance "Norwegian Polar Institute" is related to the class "foaf:Person" by means of the property "ec:hasLiason"? In that case, the ontology would fall under the OWL Full profile; is that intended? In general, the representation seems to mix the conceptual and the data levels. It is also not clear what the hexagons behind the "skos:ConceptScheme" class in Fig 7 mean; rdf:type, maybe?
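
Points 4 and 5 can be illustrated with a small sketch. It uses plain Python tuples in place of an RDF library, and every `ex:` identifier and sample triple in it is hypothetical. The first helper groups URIs connected by `owl:sameAs`, showing that several URIs may name one resource. The second flags triples whose object is declared as a class but whose predicate is not `rdf:type`, which is the instance-to-class pattern questioned for Fig 6. The check is only a prompt for manual review, not a profile validator.

```python
# Sketch under stated assumptions: triples are plain (subject, predicate, object)
# tuples, and all ex: identifiers are made up for illustration.
SAME_AS = "owl:sameAs"
RDF_TYPE = "rdf:type"
OWL_CLASS = "owl:Class"

triples = [
    ("ex:npi", SAME_AS, "ex:npolar"),           # two URIs, one organization
    ("foaf:Person", RDF_TYPE, OWL_CLASS),       # foaf:Person is declared as a class
    ("ex:someone", RDF_TYPE, "foaf:Person"),    # ordinary instance-of statement
    ("ex:npi", "ec:hasLiason", "foaf:Person"),  # instance linked directly to a class
    ("ex:npi", "ec:hasLiason", "ex:someone"),   # instance linked to an instance
]


def alias_groups(triples):
    """Group URIs connected by owl:sameAs using a simple union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == SAME_AS:
            parent[find(s)] = find(o)

    groups = {}
    for uri in list(parent):
        groups.setdefault(find(uri), set()).add(uri)
    return [g for g in groups.values() if len(g) > 1]


def class_as_object(triples):
    """Return triples that point at a declared class via a predicate other than rdf:type."""
    classes = {s for s, p, o in triples if p == RDF_TYPE and o == OWL_CLASS}
    return [t for t in triples if t[1] != RDF_TYPE and t[2] in classes]


print(alias_groups(triples))
print(class_as_object(triples))
```

In this sample data, `ex:npi` and `ex:npolar` end up in one alias group, and only the triple that links `ex:npi` directly to `foaf:Person` is flagged.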