Hybrid reasoning in knowledge graphs: Combining symbolic reasoning and statistical reasoning

Tracking #: 2200-3413

Authors: 
Weizhuo Li
Guilin Qi
Qiu Ji

Responsible editor: 
Guest Editor 10-years SWJ

Submission type: 
Other
Abstract: 
Knowledge graphs, as the backbone of many information systems, have been created to organize rapidly growing knowledge in a semantic and visual manner. Symbolic reasoning and statistical reasoning are the current mainstream techniques, and they play important roles in knowledge completion, automatic schema construction, complex question answering, and the explanation of AI. However, each has its own merits and limitations. Therefore, it is desirable to combine them to provide hybrid reasoning in a knowledge graph. In this paper, we present the first survey of methods for hybrid reasoning in knowledge graphs. We categorize existing methods based on problem settings and reasoning tasks, and introduce their key ideas. Finally, we re-examine the remaining research problems and outline future directions for hybrid reasoning in knowledge graphs.
Tags: 
Reviewed

Decision/Status: 
Minor Revision

Solicited Reviews:
Review #1
Anonymous submitted on 21/Jun/2019
Suggestion:
Minor Revision
Review Comment:

This paper surveys a number of different tasks on Knowledge Graphs that require the combination of statistical and logic-based deduction techniques. The tasks that are discussed in the paper are interesting and very topical, and I believe that they will continue to attract a lot of interest in the next few years. The list of references provided is useful and quite exhaustive.

My main concern with the paper is the quality of the writing, which in my opinion requires significant improvement. As a first step, I would strongly advise the authors to get their paper checked by a fluent English speaker. See below a (non-exhaustive) list of minor comments for the authors to address.

- Page 1. The following sentence doesn't parse in English: "Knowledge graph (KG), as a backbone of many information systems, has been created to organize the rapidly growing knowledge in a semantical and visualized manner".

- Page 1. "A fact in a knowledge graph is usually represented as a triple of the form (head entity, relation, tail entity)" ... "e.g., (Barack Obama, BornIn, Honolulu, Hawaii, U.S)". It is unclear what the elements of the triple in the example are.

- Page 1. It is unclear what the authors mean by the "incompleteness of triples".

- Page 2. The following sentence doesn't parse in English: "Unfortunately, no single method can be competent for knowledge reasoning perfectly."

- Page 2. I didn't understand what the authors meant by "soften symbolic reasoning in order to be compatible with objective facts well."

- Page 3. The following sentence doesn't parse in English: "Existing knowledge graphs mostly contain a large number of factual knowledge and a small number of schematic knowledge."

- Page 6. There seems to be some confusion between the terms "query answering" and "question answering". The former typically refers to the computation of all answers to a query expressed in a formal language such as SPARQL, whereas the latter typically refers to questions expressed by users in natural language.

- Page 6. I didn't understand the sentence "Chen et al. [58] exploited the semantics of data streams interpreted in ontologies to tackle the problem of concept drift." In particular, I don't know what the semantics of data streams interpreted in ontologies is.

- Page 7. "It is still challenging for existing methods to extend the set of rules to more complex nonmonotonic ones such as existential variables or disjunctions in rule heads." I didn't understand this: neither disjunctions in rule heads nor existential variables introduce non-monotonicity.

Review #2
By Axel Polleres submitted on 09/Aug/2019
Suggestion:
Accept
Review Comment:

Hybrid reasoning in knowledge graphs: Combining symbolic reasoning and statistical reasoning

Submitted by Guilin Qi on 05/06/2019 - 11:49
Tracking #: 2200-3413

Since this is an entry for the Editorial board papers, I am reviewing it in a lightweight fashion and didn't put any revision requirements on it. Still, I'd recommend that the authors take the comments below into consideration.

Please find attached also an annotated PDF with handwritten notes.
Some points in more detail below:

* Please check the singular/plural mix in some sentences, the articles, and the word order, and in general maybe have the whole paper proofread for grammar. I marked several things I noticed in the attached PDF (as my handwriting is probably hard to read, don’t hesitate to get back to me if you can’t read it).

* When you talk about “query answer” on page 2, you really mean “question answering (QA)”, right? I think these terms shouldn’t be mixed or used interchangeably, because they refer to different things: the latter to answering (natural language) questions, the former to structured queries in a query language, which I think you didn’t mean.

* When you mention definitions of KGs, you may also want to look at our Dagstuhl report, where one chapter was about KG definitions:

cf. Piero Andrea Bonatti, Stefan Decker, Axel Polleres, and Valentina Presutti, editors. Knowledge Graphs: New Directions for Knowledge Representation on the Semantic Web (Dagstuhl Seminar 18371), volume 8, Dagstuhl, Germany, 2019. Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik. http://drops.dagstuhl.de/opus/volltexte/2019/10328

* On p.3 something’s wrong with footnote 0 … not referenced in the text, it seems.

* p.3: I think “transnational distance models” should be "*translational* distance models"

* p. 4: “TCE takes two structured information” -> “TCE takes two kinds of structured information”

* I am not entirely clear about the separation between sections 3.2 and 3.3 topicwise, as they seem to cover similar issues/methods: Could/should they be combined or can you make the separation a bit clearer? I find it hard to grasp the concrete common task addressed by the methods in 3.3, please clarify.

* 3.5 as mentioned above should IMHO be Question Answering. FYI, you also may want to have a look at our latest CIKM paper on QA, I can send you a pre-print, if you want:

Svitlana Vakulenko, Javier Fernández, Axel Polleres, Maarten de Rijke and Michael Cochez. Message Passing for Complex Question Answering over Knowledge Graphs. to appear in CIKM2019.

* On the combination of rules and ML/statistical methods for KG enrichment, you may also want to check Stefan Bischof’s thesis, who did that for a concrete domain: https://aic.ai.wu.ac.at/~polleres/supervised_theses/Stefan_Bischof_Disse...

* I was a bit wondering why you didn’t mention RDF2Vec and GloVe when talking about graph embeddings:

Cochez, M., Ristoski, P., Ponzetto, S.P., Paulheim, H.: Biased graph walks for RDF graph embeddings. In: WIMS 2017. pp. 21:1–21:12 (2017)

Cochez, M., Ristoski, P., Ponzetto, S.P., Paulheim, H.: Global RDF vector space embeddings. In: ISWC 2017. pp. 190–207. Springer (2017)

Last, but not least: I like the quite comprehensive literature list of the paper! Maybe I would have hoped for one or two more critical take-aways in your open challenges in the conclusions… i.e., in terms of an estimate, how far off we are in solving these or whether the open problems you mention are feasible/solvable at all in the near future (e.g. getting existentials, disjunction, or in general complex axioms into rule learning seems to be a pretty hard nut to crack). I think in the special issue article, we are free to add opinions, and your view on that would be appreciated.

One more thing: as for my own references mentioned above, please feel free to ignore them. I just thought they might be interesting for you; I don’t mean to push in citations to our own work.

best regards,
Axel


Comments

Thanks for submitting the paper. I'm not a reviewer, but have read the paper and have two questions:
1. the work on RDF2Vec seems strangely absent from your overview. Is that for a reason?

2. I'm somewhat confused by your six categories (section 3). Some of your categories seem to me to be methods ("statistical relational learning") that could be used for many different goals, while others seem to me to be goals ("knowledge alignment") that could be achieved with many different methods. Does it make sense to have such mixed categories in your list? Or am I mistaken?

Thanks for the comments. For the first question, we did not include RDF2Vec because we cannot cite all the papers on KG embedding, but we agree that RDF2Vec is important and will add a reference to it. For the second question, thanks for pointing this out. Indeed, we would like to classify methods according to goals, so we will make this clear and modify the paper.

... on top of the review, I sent my handwritten notes with some additional editorial suggestions and typo corrections to the author by email.