Optimizing Tableau Reasoning: a Prolog-based Framework

Tracking #: 2415-3629

Authors: 
Riccardo Zese
Giuseppe Cota

Responsible editor: 
Bernardo Cuenca Grau

Submission type: 
Full Paper
Abstract: 
One of the foremost reasoning services for knowledge bases is finding all the justifications for a query. This is useful for debugging purposes and for coping with uncertainty. Among Description Logics (DLs) reasoners, the tableau algorithm is one of the most used. However, in order to collect the justifications, the reasoners must manage the non-determinism of the tableau method. For this reason, a Prolog implementation can facilitate the management of such non-determinism. The TRILL framework contains three probabilistic reasoners written in Prolog: TRILL, TRILLP and TORNADO. Since they are all part of the same framework, the choice of which to use can easily be made via the framework settings. Each of them uses a different approach for probabilistic inference and handles different DL flavours. Our previous work showed that they can sometimes achieve better results than state-of-the-art (non-)probabilistic reasoners. In this paper we present two optimizations that improve the performance of the TRILL reasoners. In the first one, the reasoners build the hierarchy of the concepts contained in a knowledge base in order to quickly find connections among them during the expansion of the tableau. The second one modifies the order of application of the tableau rules in order to reduce the number of operations. Experimental results show the effectiveness of the introduced optimizations. All systems can be tried online in the TRILL on SWISH web application at http://trill-sw.eu/.
Full PDF Version: 
Tags: 
Reviewed

Decision/Status: 
Reject (Two Strikes)

Solicited Reviews:
Review #1
Anonymous submitted on 13/Mar/2020
Suggestion:
Minor Revision
Review Comment:

The paper has been improved in many places, in particular the use of pseudocode to describe the algorithms. A few presentation issues remain.

There was confusion about "rule applicability" in another review, and the reason is that this notion is never defined in the paper. When reading the rules in Figure 1 from the standpoint of a classical tableau algorithm, everything seems fine:

(1) IF a certain assertion is in the tableau, and IF certain other assertions are not in the tableau, THEN add these assertions.

In this setting, the rule is applicable whenever the two "IF"s hold, and the rule application means that the "THEN" part is executed. In the setting of the current paper, however, there is another part to each rule, which is to extend the tracing function \tau. But this should be executed regardless of whether the second "IF" is satisfied (which only ensures that the assertion that is traced is actually present). Hence, the rules should be of the following form:

(2) IF a certain assertion is in the tableau, THEN
(2a) IF certain other assertions are not in the tableau, add them; and
(2b) extend the tracing function of these assertions.

That is, applicability means only that the first "IF" should hold, and the rule application encompasses both the second "IF" and the tracing step.

But even this does not completely clarify "applicability" of a rule. As the authors note (in the appendix!), applicability (as checked by the update_ABox predicate) also depends on whether step (2b) would actually extend the tracing function. If \tau is not changed by this step, then the rule is not actually considered applicable (otherwise this would lead to infinite cycles of rule applications).
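
To make this concrete, here is a minimal Prolog sketch of the intended behaviour (my reconstruction; only the name update_ABox comes from the paper, and the representation of the ABox as a list of assertion-explanations pairs is an assumption):

    % The predicate fails when neither the tableau nor the tracing
    % function would change, so the rule is not considered applicable
    % and infinite cycles of rule applications are avoided.
    update_ABox(ABox0, Assertion, Expl, ABox) :-
        (   \+ member((Assertion, _), ABox0)
        ->  % the assertion is new: add it with its explanation
            ABox = [(Assertion, [Expl]) | ABox0]
        ;   % the assertion exists: applicable only if tau changes
            select((Assertion, Expls0), ABox0, Rest),
            \+ member(Expl, Expls0),
            ABox = [(Assertion, [Expl | Expls0]) | Rest]
        ).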

For similar reasons, the phrases "the algorithm terminates with any rule application order" and "if a rule is applicable on a tableau, it will be so also in every its expansion" (used several times in the paper and the response) are confusing, because all the citations in these sentences are to papers about classical tableau algorithms. Clearly, the question is not about termination of the classical tableau algorithms (as proven in the cited papers), but of their extensions with a tracing function, where applicability is defined differently.

Hence, the authors need to clearly define what "applicable" means in Section 2.2.2 and adapt the presentation of the rules accordingly.

Smaller issues:

Even though the use of \tau(R^-,) for \tau(R,) is intuitive, it nevertheless needs to be defined in the paper.

In the response, the statement "S=R is equivalent to R\sqsubseteq S and S\sqsubseteq R" is wrong. This holds only for the _entailment_ of subclass and equivalence (\equiv) axioms. And even if this is supposed to read that R\sqsubseteq S and S\sqsubseteq R are _axioms entailed by the KB_, this does not mean that R _equals_ S.

In Figure 1, "S-neighbour" cannot be defined simply by saying "linked in the KB". First, the link is in the tableau, not in the original KB. Second, since there is no explicit rule for role inclusions, nodes x and y that are "linked" by R such that "R \sqsubseteq S" also need to be considered S-neighbours (e.g. for the "\geq / \leq" rules), and, more importantly, this role inclusion (or possibly several) needs to be added to the tracing function.
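
For reference, a standard SHIQ-style definition (my formulation, not necessarily the one the authors intend) would be: y is an S-neighbour of x in the tableau T iff there is a role R with R \sqsubseteq^* S (the reflexive-transitive closure of the role hierarchy) such that <x,y>:R \in T or <y,x>:Inv(R) \in T; the role inclusion axioms witnessing R \sqsubseteq^* S must then also be added to the tracing function of any assertion derived via this neighbourhood.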

p4: "K \ E1" -> "K \ {E1}" etc.

p5 l19: "\ge nS" -> "\ge n S.C" (two times)

p5 l28: "... \cup" -> "\cup ... \cup"

Fig. 3: It is confusing that "apply_nondet_rules" is called from "apply_det_rules". Rather, it should be called directly from "apply_all_rules".
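
For what it is worth, a minimal Prolog sketch of the restructured control flow (the control flow is my assumption; only the three predicate names come from the paper; *-> is SWI-Prolog's soft-cut, which keeps the choice points of the non-deterministic rules open for backtracking):

    % Try deterministic rules first and commit to them; only when none
    % is applicable, try the non-deterministic rules, whose choice
    % points remain open; if no rule applies at all, the tableau is
    % fully expanded and returned unchanged.
    apply_all_rules(Tab0, Tab) :-
        (   apply_det_rules(Tab0, Tab1)
        ->  apply_all_rules(Tab1, Tab)
        ;   (   apply_nondet_rules(Tab0, Tab1)
            *-> apply_all_rules(Tab1, Tab)
            ;   Tab = Tab0
            )
        ).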

p10 l49: "these two classes and all their subclasses are unsatisfiable": If a subclass is disjoint with a superclass, then only the subclass is necessarily unsatisfiable, but not the superclass.

Fig. 6 l12: It is very confusing if sometimes the order "subclass,superclass" is used (e.g. in "add_link_id" or "add_subclass_expl") and sometimes "superclass,subclass" (e.g. in "are_subclasses" or "add_edge"). This needs to be changed.

Fig. 6 l16: Only the edge between "SubClass" and top (0) is removed, but what happens if SubClass already has another superclass that is also a superclass of "SupClass"?

Fig.7 l3: "AsstoExpand" -> "AssToExpand" or better "AssertionToExpand"

p14 l23: "subsumed by" -> "subsuming"

p18: The sentence starting with "Tables 9, 10 and 11 show ..." should come before the sentences talking about the cells of these tables.

p18 l48: "the application of the -> unfold rule is almost instantaneous.": In any reasoner, one application of this rule is usually instantaneous.

Table 13: "trillp" -> "\trillp"

p30 l40: "To facilitate the reading ...": The hierarchy is not a tree, so it should not be represented as a tree nor called "TreeH" (e.g. in Fig. 14).

Instead of [25], why not cite the paper from LPAR 1999?

The paper would really benefit from proof-reading by a native English speaker; here are some examples:

"has not any" -> "has no" or "does not have any" (many times)

"will be so also in every its expansion" -> "will be so in every expansion"

p9 l16: "with" -> "to"

p9 l42: "taken in input" -> "taken as input" or "from the input"

p10 l21: "pair key-value" -> "key-value pair"

p12 l32: "by" -> "of"

p14 l23: "what contained" -> "what is contained"

p15 l23: "To facilitate the reader that" -> "To help the reader who"

(here I stopped)

Review #2
Anonymous submitted on 29/Apr/2020
Suggestion:
Reject
Review Comment:

The authors substantially revised their article about two optimization techniques for (Prolog-based implementations of) tableau algorithms for Description Logics and addressed several issues pointed out by the reviewers. This includes the replacement of Prolog code with pseudo code, removing irrelevant descriptions, and improving the evaluation. Moreover, they fixed several bugs in their implementation and provided source code as well as evaluation data to reproduce the results, which I really appreciate.

Although the article improved quite a bit and I still consider the Prolog-based reasoners quite interesting for the Semantic Web community, I do not think that the article is (close to being) ready for publication. It still seems to be a mix of a system description and a research paper. In fact, the authors describe their systems at length, including many aspects that are irrelevant for the understanding, the implementation, or the evaluation of the proposed optimizations (e.g. the architecture with the used libraries). I wonder whether it is even relevant that the system is implemented in Prolog, since the optimizations do not use any Prolog-specific features and also do not provide special benefits for Prolog-based implementations.

Despite the updated related work and the rebuttal, I still do not think that the presented optimizations are particularly novel or very challenging. In fact, the discussion of the related work gives the impression that the proposed optimizations are very similar to existing ones or that existing ones can easily be extended to achieve a similar effect (e.g., by extending saturation rules with a tracing function, which seems straightforward since they are just simplified tableau expansion rules). Some aspects of the optimizations even seem trivial. For example, it is clear that deterministic rules should be applied before non-deterministic rules, or that one should trigger the rule applications via added (concept) facts in the completion graph (instead of naively testing the applicability of each rule after each change).
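
As an illustration of the saturation point (my formulation, not the authors'): the standard completion rule "from A \sqsubseteq B and B \sqsubseteq C derive A \sqsubseteq C" only needs the extra bookkeeping \tau(A \sqsubseteq C) := \tau(A \sqsubseteq B) \cup \tau(B \sqsubseteq C), which mirrors exactly the tracing performed by the tableau rules.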

Although the authors fixed some bugs in the implementation, the reasoners still seem to have several issues. In particular, inverse roles are ignored or result in reasoning timeouts even for trivial test cases (tested with the online version of TRILL).

The authors tried to clarify why the order of the rule applications is not relevant. It is clear that this is the case for checking consistency/satisfiability with the tableau algorithm. Unfortunately, I still cannot see from their explanations why this also holds for determining the justifications. In fact, the presented tableau algorithm only unfolds A to C for an individual/node x in the completion graph with x:A and the axiom A \sqsubseteq C in the knowledge base if C is not already in the label of x. If we additionally have the axioms A \sqsubseteq B and B \sqsubseteq C in the knowledge base and the concept A has already been unfolded to C for x, then we may only be able to unfold A to B for A \sqsubseteq B. Consequently, \tau only contains the axioms x:A and A \sqsubseteq C for the fact C(x), and the justification involving B \sqsubseteq C is never recorded. It is clear that the HST algorithm is used for the classical tableau algorithm to find the other justifications (e.g., x:A, A \sqsubseteq B, and B \sqsubseteq C for the instance query w.r.t. concept C and the given assertion x:A), but it is unclear how this is handled by their Prolog-based implementation. It is mentioned that Prolog is used for backtracking, but it is not clear to me whether this is also the case for the "don't-care" non-determinism (i.e., the order in which the expansion rules are applied). It is clear that one can find all justifications if the expansion rules are applied in each possible way (e.g., by first applying the unfolding rule for A \sqsubseteq B and then for B \sqsubseteq C, we obtain the other justification). However, this does not seem to work if one always applies deterministic rules before non-deterministic ones, and it could easily be problematic for the performance (for non-trivial ontologies). Another possibility would be to extend the tracing function such that \tau is extended for a fact even if the fact is already present (such as in [55]). However, this is not indicated by the presented tableau algorithm. The implementation clearly seems able to handle such cases (now), but the paper does not describe it sufficiently.
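
To spell the example out (my notation): with K = {A \sqsubseteq C, A \sqsubseteq B, B \sqsubseteq C} and the assertion x:A, the query C(x) has two justifications, but a single run records only one of them, depending on the rule order:

    order 1 (unfold A \sqsubseteq C first):                 \tau(C(x)) = {x:A, A \sqsubseteq C}
    order 2 (unfold A \sqsubseteq B, then B \sqsubseteq C): \tau(C(x)) = {x:A, A \sqsubseteq B, B \sqsubseteq C}

In either order the rule that would produce the other justification is blocked, because C is already in the label of x; the missing justification can then only be recovered by HST-style re-runs on a modified knowledge base or by extending \tau for already-present facts.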

I further think that the evaluation can and should be made more compact, e.g., by presenting only the most important results (maybe only for one reasoner, as also suggested by other reviews). In particular, there are several tables that do not reveal any interesting information (e.g., Tables 4, 6, 7, 8) and for several tables it would be sufficient to just present one column/row (e.g., Tables 9, 10, 11). Of course, there is no direct page limit, but there is also no point in presenting uninteresting results. I would also prefer an evaluation that covers the consistency checking/justification computation of many more real-world ontologies (e.g., several hundred ontologies from ORE2015 or the Oxford ontology repository, where unsupported axioms are removed or ontologies with unsupported axioms are filtered out).

It is further unclear to me why probabilistic reasoning is still so intensively discussed and evaluated in the paper. In fact, the presented optimizations are not directly related to probabilistic reasoning. It is clear that probabilistic reasoning under the DISPONTE semantics benefits from a faster computation of justifications, but this is also the case for many other application scenarios, i.e., probabilistic reasoning can be reduced to a simple motivation for requiring all justifications (which can be stated in one or two sentences). In my opinion, it is also not necessary to show evaluation results w.r.t. probabilistic reasoning in the paper.

Unfortunately, the paper still contains several errors as well as typos and has a lot of room for improvement. Many sentences are difficult to read, and there are also some issues with the grammar. Also, a few statements are repeated quite often, e.g., that the reasoners are implemented in Prolog and that they are part of the TRILL framework, or where to find/download the reasoners and the evaluation resources. Moreover, some parts are insufficiently presented, while other sections seem totally irrelevant for the paper and the actual research question.

In summary, I still recommend rejecting the article due to the large number of errors and presentation issues, the low originality, and the fact that the progress so far does not indicate that the article will be in a publishable state any time soon.

Other comments (l for left-hand side, r for right-hand side):
- On DockerHub, there should be an image with the latest tag; otherwise one always has to specify the version. Moreover, it would be great if the evaluation were fully automated, i.e., only one script/command had to be executed to get the results.
- Abstract: Is it really useful to introduce the reasoners in the abstract for the given research paper? How are they relevant, and is it important to know that they can easily be switched via a framework setting? The reader probably also does not understand what the connection between the mentioned probabilistic reasoning/inference and these optimizations is. You may want to change the focus of the paper (e.g., a system description about efficiently realising tableau reasoning with Prolog) or focus on the actual research question (which should more or less be independent of a particular system). I am also not very happy with the title. At the moment, it is unclear whether you want to optimize tableau-based reasoning or whether you develop a Prolog-based framework that allows for developing and experimenting with tableau-based reasoning optimizations (or even both). It is also unclear to me what the authors mean by "in order to collect the justifications, the reasoners must manage the non-determinism of the tableau method". As far as I know, the non-determinism of the tableau algorithm has to be managed for every reasoning task.
-p1, r, 41: The sentence "Among them, in ..." seems strange.
-p2, l, 10: It sounds strange to me to specify in the knowledge base which reasoner should be used. Isn't that like specifying in a video file which video player should be used? Is that even important for the paper?
-p2, l, 29, "find connections among them": This is a little bit vague and it is unclear how these "connections" are used in the reasoning process.
-p2, l, 33: This sentence breaks the reading flow and it is unclear to me why it is at that position.
-p2, l, 47, or -> and
-Introduction: Until the very end of the introduction section it is not clarified how these optimizations are relevant for probabilistic reasoning.
-p2, r, 23: Is it really necessary to mention/introduce datatypes?
-p2, r, 30: could be -> is
-p2, r, 32: It is not clarified what the set A means. Probably the set of atomic concepts?
-p3, l, 1: "it has not any transitive sub-role" -> "it has no transitive sub-role". Does this definition also consider sub-roles over inverses?
-p3, l, 8: Again, I wouldn't mention datatypes at all. Also, the semantics for the concept constructors seems missing.
-p3, l, 9: What does "usually" mean? You may also want to specify what a model is and what consistency, entailment, etc. mean. Shouldn't the query entailment test be written as K \models_? Q?
-p3, l, 28: The tableau algorithm is usually not directly used to answer queries. I think it would be better to state that the queries/reasoning tasks are reduced to consistency/satisfiability checking and this is typically realised with tableau algorithms (for more expressive DLs).
-p3, l, 38: such that -> for which
-p3, l, 47: If no clashes are found AND it is fully expanded, then the tableau can be unravelled to a model (but you should clarify what a model is).
-p3, r, 4: let's -> let us (or just say)
-p3, r, 8: In the case of application of a -> In the case of applying a …
-p3, r, 13: Special condition sounds strange (what is that special condition?). You may want to directly specify that it is a cycle detection technique.
-p3, r, 18: You mention covering set of justifications, but it is not directly clarified what that means. If the definition is somewhere else, then you should at least give an intuition. Also, is it really necessary to define covering sets of justifications? It seems the same as all justifications for a query, i.e., why not just use "all justifications"?
-p3, r, 42: Is it really necessary to introduce the pinpointing formula? It only seems to be a different, more compact encoding of the justifications and I do not see how this is relevant for the presented optimizations (it seems to be a technical detail).
-p4, l, 7: I couldn't find Lemma 2.4 or Section 7 in [16].
-p4, l, 31: You may want to shorten the title of the section, e.g., Axiom Pinpointing with Tableau Algorithm.
-p4, l, 35: Shouldn't the tracing function also map inequalities to justifications?
-p4, l, 49: My understanding is that, with the presented tableau algorithm, only one justification can be identified from the clashes (and not the entire covering set of justifications), which is the reason why the HST algorithm is required.
-p4, r, 12: forces -> force. Also, these systems do not force the tableau algorithm to find a new justification, since there is no steering/controlling of the tableau algorithm or of which rules are applied. In fact, the knowledge base is changed such that the tableau algorithm can find other justifications. Furthermore, I do not think that this has anything to do with the programming language.
-p4, r, 27: looks -> it looks
-Figure 1: Is it really necessary to have two ifs for all rules? I think most conditions can be combined, which would simplify the presentation. I also agree with Reviewer #1 that the presentation of all the tableau rules is probably not very helpful if the reader is not already familiar with the algorithm. I think it is not even required to present the rules, since the optimizations do not really rely on them (one could just give a few examples of how the expansion works). Moreover, since several parts are not clearly defined, the presentation of the tableau algorithm raises additional questions. For example, what is the outcome of the Merge function such that it can be used for the tracing function (e.g., \tau(Merge(b_i, b_j)) = …)? Also, for the \geq-rule, \neq(b_i,b_j) with i = j does not make sense. Moreover, \tau for neighbour links is not clear (due to the lack of a proper definition) and the description in the caption is hardly helpful. It is anyway not a good idea, in my opinion, to put so many definitions/explanations into the caption of a figure/table.
-Section 2.3: I do not think that this section is necessary for the purpose of this paper.
-Section 3: Again, I do not think that it is necessary to describe the reasoners and the framework in detail, i.e., I suggest to remove most of Section 3.
-p6, l, 34, TRILL^P "can also compute the probability of the query by representing the formula with a BDD": This seems to be the same as realised with TORNADO. It is unclear what the actual difference is. Also, I don't think that it is required to state multiple times that these reasoners are implemented in Prolog.
-p6, r, 22: I don't think that it is necessary to show an extract of a knowledge base with the directive to specify the reasoner (which, in my opinion, does not make so much sense anyway).
-p7, l, 12: Shouldn't A also contain some information about role assertions? If T contains the information of the Tableau, why do we need A? How is that information relevant to understand the optimizations?
-p7, l, 29: The authors claim here that the order of the rule applications does not affect the completeness for finding the covering set of justifications. But this does not seem true (which is the reason for the HST algorithm). Moreover, I do not agree with "if a rule is applicable on a tableau, it will be so also in every its expansion.". In fact, the unfolding rule is only applicable for a node/individual x if the implied concept is not already in its label, i.e., if the implied concept is derived via another expansion rule (which obviously results in an extension of the tableau), then the unfolding rule may no longer be applicable. These contradictions in the paper are very confusing.
-p7, r, 37: I do not understand what that sentence means.
-Figure 3: The function name for APPLY_DET_RULES is misleading since it also applies non-deterministic rules (if deterministic rules are not applicable). In my opinion, Figure 3 is not necessary if you clearly state that the deterministic rules are applied before the non-deterministic rules and that backtracking is performed if one non-deterministic choice results in a clash.
-p8, r: There is a lot of space between the lines, which looks strange.
-Figure 4: Again, I don't think that it is necessary to show the architecture of the framework in this research paper.
-p9, l, 22: Is it a good idea to have the new version with the optimizations in a separate branch but the old version in the master branch? When do you plan to integrate these optimizations (do you want to wait until the paper is reviewed?) or how do you then update the links in (possibly published) papers?
-p9, r, 27: HierIn can easily be confused with the described hierarchy, but also seems to contain other data structures, e.g., classes.
-p9, r, 29: Is it really a list, where it is checked whether the class is already present?
-p9, r, 33: Isn't the identifier already required to check whether the class is already present?
-p9, r, 43: I do not understand how the management of individuals and roles is relevant for the paper.
-p10, r, 30, "it will consider all these explanations...": How are they considered? This seems one of the most interesting parts of the proposed optimization and it is a pity that it is not described in detail.
-Figure 6: I don't think that it is necessary to describe obvious parameters such as SubClass or SupClass. Also, the comments with line breaks do not look so nice. The caption mentions "eventually", but I do not understand what that means in the given context.
-p11, l, 35: Shouldn't the unfolding rule look for the node that represents C (instead of D) in the hierarchy such that it can identify the subsumed concepts (i.e., the Ds) by following the links? (A sketch of what I have in mind is given after this list.)
-p11, l, 50: forces -> force. Again, I do not think that this is a matter of the programming language.
-p11, r, 41, "Every expansion rule is called following a certain order usually defined in the specific system." I don't think that this is necessarily the case. Many state-of-the-art systems use special conditions to trigger the rules, i.e., the applicability of a rule is only checked if certain facts are added and the applicability of the rules may only be checked for these facts.
-p12, l, 23, "every rule has already been applied and its tracing function has not been changed by now": It is not clear to me what that means. Can \tau change for an existing fact in the tableau? This does not seem indicated by the depicted expansion rules. And even if \tau changed for a fact, other rules would not be reapplied, i.e., it does not make sense to change \tau for a fact in the completion graph.
-p12, r, 24: Are other rules than the unfolding rule also exploiting the hierarchy (which seems indicated by that sentence)? If so, how?
-p12, r, 35: Let -> Let us. Also, I do not think that the assumed simplification makes sense.
-p13, r, 43, "Given the similarities of the two approaches, the TRILL systems can easily be extended to perform the classification task": I do not necessarily agree with that since they are used quite differently.
-p15, l, 2, "To facilitate the reader": Sounds strange, do you mean "to facilitate reading"?
-p15, l, 34, "As proved in [17]": Is it really proved or did the results show/indicate it?
-Table 1: It would be great if the Description Logic language is shown for the ontologies. You may also want to add the number of properties and the FMA ontology.
-Table 2: Does it really make sense to abbreviate TORNADO with TORN.? Also, what is the running time? Does it include the loading/parsing of the ontology?
-p16, l, 28: not justifications -> no justifications. Also, it is not clear to me why the reasoners have to explore the entire search space if there are no justifications. Shouldn't that mean that the algorithm found a counter-model, for which it is not necessary to further explore the remaining search space?
-p16, l, 34: It is not necessary to explain the tables and their columns both in the text and in the captions of the tables. I would prefer it if the captions of the tables were more compact, containing only the most important information such that the different tables can be distinguished.
-p16, l, 45: In Table 4, you used TORNADO also for non-probabilistic query answering, which contradicts this sentence.
-p16, r, 29: What do you mean with measurement errors? Shouldn't it be possible to avoid them? Do you mean variance in running time?
-p16, r, 39: I wouldn't use the word prove in this context. I think it is better to say that the results indicate or maybe show something.
-Table 4: You can just say that probabilistic query answering does not increase the reasoning time; there is no need to show an entire table without any significant change.
-p17, r, 7: I do not agree with that. Don't-care means that the results are the same (which, however, also does not hold for collecting justifications), rather than that the same rule will be applicable for the same fact in every expansion.
-p17, r, 15: Do you mean that you also create choice points for the application of deterministic rules (since the given ontology is completely deterministic)? If so, this should be explained in more detail and not just in the evaluation section!
-Table 6-8: I don't see that these tables show any interesting results, i.e., they can just be removed.
-p19, l, 46: was few higher -> was a few times higher
-Table 9-11: I think that it would be sufficient to only vary n or m such that you only have one column or row for each reasoner, which would save a lot of space.
-p21, r, 25: were -> was?
-p21, r, 28: What happens after the splitting? I do not really understand the query generation for the BRCA knowledge base.
-Table 13: trillp -> TRILL^p
-p22, r, 34: There are some techniques that go in the described direction and that you may want to consider, e.g., ABox absorption, module extraction/atomic decomposition, and completion graph caching techniques.
-p23, l, 1: I'm not really sure whether this paragraph is particularly relevant for the paper.
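
Regarding the comment on p11, l, 35 above, here is a minimal Prolog sketch of what I would expect the hierarchy-based unfolding to look like (all predicate names here are hypothetical, and the direction of the links is my assumption):

    % To unfold Ind:C, look up the node representing C in the hierarchy
    % and follow the subclass links from there, collecting every class D
    % reachable from C together with the axioms justifying the link.
    unfold_with_hierarchy(Hier, Ind, C, Additions) :-
        node_of(Hier, C, NodeC),
        findall(Ind-D-Expl,
                (   linked_class_of(Hier, NodeC, NodeD, Expl),
                    class_of(Hier, NodeD, D)
                ),
                Additions).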

Review #3
Anonymous submitted on 26/Jun/2020
Suggestion:
Minor Revision
Review Comment:

In general, in my opinion, the authors have responded to the reviewers' comments very well and have rewritten almost all crucial parts of each section: the introduction on SHIQ, the code snippets (now replaced by pseudo-code), the additional comparison between their work and Konclude, and the number and types of tests used to evaluate the system.

In particular, the pseudo-code now makes the explanation of the optimizations clearer, and most of the tests they used cover important aspects of what actually needs to be tested, such as the performance of the system when answering queries that (do not) have justifications, when dealing with non-deterministic rules, or when dealing with a number of justifications that increases exponentially.

Looking at all the revisions they made, it seems that this paper is still reasonable to accept, but minor revisions, in terms of the technicality of the evaluation as well as the quality of writing, still need to be made.

Technical questions:
The only test that somehow still lacks motivation is test 5. What is actually the purpose of test 5? The authors mostly write about what the KBs look like, the adaptation of a method that has been used by Klinov and Parsia, the assignment of probability values to each simple class, and the result of the test. However, I still cannot see what aspects test 5 really wants to highlight in order to challenge the proposed optimizations.
I was actually wondering whether the tests include ontologies with axioms containing qualified number restrictions, in particular concepts built with the at-most constructor, which also trigger non-deterministic rules. I understand that test 3 is used to test the performance of the system when dealing with non-deterministic rules. However, this test only concerns disjunctions and only operates at the level of concept names. What about a test that challenges the system with non-deterministic rules triggered mostly by, e.g., at-most restrictions, since this kind of concept mostly operates on individuals and requires choosing which individuals need to be merged? To make it more challenging, the test could also be performed similarly to test 2, where the number of added individuals varies from, for instance, 10 to 100. As shown by test 2, the larger the number of assertions/individuals, the faster the performance, since the optimized expansion rules make the construction of the tableau more effective. But what happens if the expansion rules at the same time also have to deal with many non-deterministic rules, mostly triggered by at-most restrictions? Does the general result of Table 2 still, more or less, hold?
I do not know whether such a real ontology exists, but an artificial test ontology would also be fine.
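
Regarding the at-most test suggested above, a minimal artificial KB (my construction, not taken from the paper) would be { a : \leq 1 R.C } together with the assertions <a,b_i> : R and b_i : C for 1 \le i \le n: the \leq-rule then repeatedly has to choose which pair b_i, b_j to merge, so the number of non-deterministic choices grows with n.
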
Test 4 has shown that the new version outperforms the old system when dealing with a query that has 2^n explanations w.r.t. a given ontology. Now, I wonder whether this positive performance can still be attained if the exponentiality comes not only from the increasing number of justifications but also from the increasing size of the model of the ontology that is created during the reasoning process. As we can see in test 4, if the ontology is written in EL, then the constructed (canonical) model is also linear in the size of n. Now, what happens if, for instance, we add more concepts to test 4 that may cause the size of the model to become exponential in n? For instance, if we have value restrictions and existential restrictions at the same time, then the size of the constructed (canonical) model may become exponential. Would it make sense to extend test 4 so that it also considers such an exponential blow-up of the model size, in addition to the exponential number of justifications?
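
One simple family that forces exponentially large tableaux (my construction; here existential restrictions alone already suffice) is A_i \sqsubseteq \exists R.A_{i+1} \sqcap \exists S.A_{i+1} for 0 \le i < n: starting from a : A_0, the tableau builds a binary tree of depth n with 2^n leaves, and combining this with the justification blow-up of test 4 would stress both sources of exponentiality at once.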

After reading this current version, it seems that the main contributions of this article are the proposed optimizations and the comparison between the old and the new system based on the tests provided in the evaluation part. This means that the Prolog-based framework, and even the code presented in the appendix, do not really look like critical contributions of this article. In other words, without reading in detail what the Prolog code looks like, the reader can still understand that the new system is competitive with, and in some cases outperforms, the old one because of the optimized algorithms. So, now I wonder: what makes Prolog really special for the optimizations introduced here? I think reviewer #3, in his/her comments on the old version, said that Prolog is used to easily manage the causes of derived assertions/facts. Is this one of the reasons why the authors use Prolog for these optimizations?
In addition, the seemingly minor role of Prolog also means that the title of this article does not really reflect its actual main contributions, as I said before.

Suggestions 2 and 3 above are obviously optional, but it would be interesting to have tests that also consider those aspects and then see how the optimized system is really affected by them.

Small comments:
Line 1 page 3: "it has not any transitive sub-role" -> "it has no transitive sub-role"
Line 3 page 3: An SHI or SHIQ KB
Line 36 page 3: Better to write the sentence with words, not starting with a symbol: "G, which is also called completion graph" -> "The graph G, which is also called completion graph".
Line 49 page 3: Please define what is meant by a monotone Boolean formula. Not all readers understand the difference between monotone and non-monotone ones. You actually defined it in Line 43 Page 3, but it is not clear and does not explicitly state that a formula built using only conjunction and disjunction is called a monotone Boolean formula.
Figure 1 on Page 4: I think \forall (S.C) should be replaced with \forall S.C. Remove the parentheses in that concept to make it more readable.
Line 37 page 12: replace ‘conjunctions of concepts’ with ‘disjunctions of concepts’.
There are a lot of sentences starting with the phrases "let's", "let's say", or "let's consider", which seem too informal for a journal paper. Could you rephrase such sentences with more formal phrasing?
Line 32 page 15: ‘have no justification’
Line 22 page 16: What is Test 6?
Line 4 to 14 page 17: This paragraph about don't-care and don't-know non-determinism has already been given on page 7. I think it is enough to only write that test 3 is performed to check how the system deals with non-deterministic rules.
Line 37 page 17: replace ‘then’ with ‘than’
Line 18 page 17 column 2: replace the dot with a comma —> … the choice of rules, which is …”
In the last three sentences on page 19, the sentence "with this KB the new version's average running time is lower than…" is repeated twice, with different numbers of seconds: 42 and 495. I think one of the sentences should talk about the old version, not the new one.
Line 45 page 21: It seems that the word 'by' is missing between 'made' and 'following'.
Line 23 page 21 column 2: What is meant by consistent ontologies? Are these ontologies that have a model? Then it would be better to define them formally, since there is no formal definition of them in the preliminaries.