ClioPatria: A Logical Programming Infrastructure for the Semantic Web

Tracking #: 988-2199

Authors: 
Jan Wielemaker
Wouter Beek
Michiel Hildebrand
Jacco van Ossenbruggen

Responsible editor: 
Axel Polleres

Submission type: 
Tool/System Report
Abstract: 
ClioPatria is a comprehensive semantic web development framework based on SWI-Prolog. SWI-Prolog provides an efficient C-based main-memory RDF store that is designed to cooperate naturally and efficiently with Prolog, realizing a flexible RDF-based environment for rule-based programming. ClioPatria extends this core with a SPARQL and LOD server, an extensible web frontend to manage the server, browse the data, and query the data using SPARQL and Prolog, and a Git-based plugin manager. The ability to query RDF using Prolog provides query composition and smooth integration with application logic. ClioPatria is primarily positioned as a prototyping platform for exploring novel ways of reasoning with RDF data. It has been used in several research projects to perform tasks such as data integration and enrichment and semantic search.
Tags: 
Reviewed

Decision/Status: 
Minor revision

Solicited Reviews:
Review #1
By Christoph Redl submitted on 23/Feb/2015
Suggestion:
Minor Revision
Review Comment:

The paper presents a semantic web framework based on SWI-Prolog extended by a SPARQL and LOD server, called ClioPatria. This is motivated by the problems introduced by typical three-tier architectures consisting of separated data and logic layers. In particular, the actual computation is often distributed over the data and the logic layer.

The basic idea of the system is to represent RDF triples as ternary atoms over a dedicated rdf predicate. In addition, several other predicates are defined which support various reasoning tasks, such as computing the transitive closure of subclass relations and SPARQL query processing. The predicates can naturally be used both for querying and for filtering. ClioPatria may be extended by additional libraries and web services.

The paper is well written and easy to follow. It makes use of numerous examples which explain the introduced concepts and predicates.

While the presentation is good, my comments concern more the content of the paper:

1. My first doubt concerns the scope of the contribution. The first part (Abstract and Sections 1 and 2) always talks about SWI-Prolog's RDF store, which gave me the impression that RDF is already supported by SWI-Prolog and is not part of the contributions of ClioPatria. In that case, the system is essentially a Prolog module which defines some additional predicates as demonstrated in the examples (such as for computing transitive closures). In contrast, in the application section and the conclusion it seems more that the RDF capabilities are considered to be part of ClioPatria, which would add to the significance of the work. According to the website, RDF support for SWI-Prolog and ClioPatria were at least developed by the same people, but it is unclear whether they are two separate (though dependent) features. If the former is considered to be part of the latter, this should be stated more clearly.

2. Another comment concerns the selection of the examples. The features demonstrated by the examples are quite simple (e.g. transitive closure), while more sophisticated reasoning tasks (e.g. SPARQL queries) are supported according to the list of contributions, but are described only informally in the remaining part of the paper. It would be good to add one such example to Section 2.3.

3. The related work section mentions several other approaches, among them GiaBATA. Except for the underlying formalism (Prolog vs. ASP), ClioPatria appears to be very similar. It was not clear to me whether ClioPatria is just a Prolog equivalent of GiaBATA or whether there are any fundamental differences or advantages (note that the combination with other external sources such as relational databases, as mentioned in the paper, is also possible in GiaBATA).

I did not find any minor issues such as typos. As this is a system presentation, the absence of formal results is ok.

In summary, the paper is well written. While comments 2. and 3. can be addressed in the final version, the significance of the contribution depends on how issue 1. is answered. If the RDF support as such is part of the system, it is worth publishing (but the differences to other approaches need to be stated more clearly, 3.); if RDF support is considered to be part of SWI-Prolog, the remaining contributions are low. Since I believe that the former is the case, my final decision is minor revision.

Review #2
By Alessandra Mileo submitted on 04/Mar/2015
Suggestion:
Minor Revision
Review Comment:

This manuscript was submitted as 'Tools and Systems Report' and should be reviewed along the following dimensions:
(1) Quality, importance, and impact of the described tool or system (convincing evidence must be provided).
(2) Clarity, illustration, and readability of the describing paper, which shall convey to the reader both the capabilities and the limitations of the tool.

This paper describes the ClioPatria toolkit based on SWI-Prolog for Semantic Web applications, and it documents how it has been used in different application scenarios and projects.

The motivations illustrate with reasonable clarity how Prolog enables easier and more flexible specification of SPARQL queries, highlighting as main advantages the easier writing of nested queries and subqueries, increased expressiveness w.r.t. SPARQL 1.0 and to some extent also SPARQL 1.1, and the ability to specify entailment modules (e.g. to express RDFS inference), as well as custom modules similar to JENA function properties, both using Prolog rules.

In general the paper is well written and presents clearly the features of the system, although I would encourage authors to identify more clearly what are the features of ClioPatria that turn out to have an impact in each of the projects mentioned.
Also, a clearer identification of ongoing work and future impact, as separated from limitations and discussions on open issues, should be presented.

Some specific comments below:

- when describing the features, authors mention the ability to "retry": does this refer to back jumping?

- although the declaration of triples and some queries (like subqueries) is more intuitive in SWI-Prolog, what about certain other SPARQL 1.1 constructs such as aggregates and property paths? Does ClioPatria support a version of aggregates that does not require specifying rules to find, for example, the first or the first N triples ordered by the value of the object/subject?

- the example in 2.1.2 requires more rules to express what rdf_reachable/3 is. This means that building entailment modules requires specifying something like a JENA function. What is the additional value of having SWI-Prolog modules instead of JENA function properties?

- does ClioPatria use mechanisms to automatically decide whether backward or forward chaining is more efficient based on some structural properties of the query, as mentioned at the end of Sec 2.1.2? Or does this have to be decided at design time?

- when describing the triple store in Sec 2.2, I would like more motivations around the choice of a C-implementation vs the use of dynamic predicates.

- in Sec 2.2 the authors mention indexes and the maintenance of specific relation indexes to speed up entailment. Is this application dependent or is there a set of recommended RDFS properties that should be maintained? What about custom-defined entailment modules?

- Sec 2.4 mentions the need for packages and extensions to follow strict guidelines. Are these guidelines somehow summarized somewhere? As a system report, that should be clarified.

- although a complete comparison with AllegroGraph is out of scope, it would be good to briefly say more about how it positions in Related work and how it compares with ClioPatria.

- in Section 4, it would be good to recall the specific role of ClioPatria (as an extension/enhancement of SPARQL) in each application for each domain. Although these are mentioned in the introduction, it would be good to provide a summary for each application area. Sometimes this is done, e.g. in the second paragraph of 4.1, but sometimes it is quite vague, as in /facet and in the last paragraph of 4.1.

- section 4.4 says the ClioPatria triple store "stresses dynamic data". What does that mean? Inference and search in the system do not seem to be tailored to data streams (= dynamic data as I see it), so this statement is a bit confusing.

- section 4.6.1 refers to the use of Prolog for planning in robotics? This is not clear and should be explicitly mentioned.

- In general, the status of each project in section 4 should be clearly stated (ongoing, finished, just started, etc). This would give an overview of the global impact and timespan.

- Future Work section should actually be discussion and limitations. I would expect a future-work section to illustrate what are the next steps in ClioPatria, the support for developers and a timeline or a list of priorities in ongoing extensions/applications of ClioPatria. There are no space limitations, therefore a better analysis of the future of the toolkit and its potential impact/directions should be provided.


Comments

All my comments have been addressed in the revision. Most importantly, the authors clarified that the underlying RDF triple store is also part of the presented system and not only the defined predicates. This was not clear in the previous version. After the clarification the contribution appears to be significant and my doubts in this respect have disappeared.