Review Comment:
This survey provides an overview of the use of background knowledge in ontology matching, by classifying and organizing existing approaches, and analyzing the results they have obtained throughout the years in the Ontology Alignment Evaluation Initiative (OAEI).
With respect to the relevance of the topic, the use of background knowledge is extremely important to the Ontology Matching community, and to a lesser degree, to the broader Semantic Web community. It has been the target of sufficient original research work that I believe a survey of the topic is merited and would be useful to the community.
Regarding the manuscript, in terms of form, I found it well organized and clearly written, despite a number of grammar and word-usage errors which need correcting. In terms of content, I found it fairly comprehensive in its coverage, and accessible as an introductory text on the topic. However, there are several issues that need addressing before the manuscript is acceptable for publication.
Content Issues (in the order in which they appear in the text):
- Section 2.1; Background Knowledge: "we can define it as any external resource of knowledge which the use can improve the matching result"
This definition is erroneous: it corresponds exactly to the definition of a "useful BK" given later in the paper, and clearly the definitions of "BK" and "useful BK" cannot be the same. While the intent of using BK is indeed to improve matching results, intent and outcome do not always coincide. A knowledge source doesn't cease to be BK if it isn't proven useful, and defining it as such wouldn't be very helpful, because BK usefulness can only be proven a posteriori, in the cases where a reference alignment exists.
What can be said about BK in OM is that it is any external knowledge resource that provides information, be it lexical or semantic, about the domain(s) of the ontology matching problem or some of the entities therein.
- Section 2.2; first paragraph: "the simple use of lexical resources such as WordNet to enrich concepts with synonyms before matching ontologies is not considered as BK-based matching in this paper"
I'm puzzled by this statement, as it seemingly contradicts the authors' broad definition of BK given previously. Why wouldn't the use of BK for lexical enrichment qualify as BK-based matching? This division between the use of BK for lexical enrichment and its use as an intermediate seems arbitrary and artificial, as the two approaches are to some degree interchangeable and will often lead to similar results. For instance, say that ontology S has the concept "hair" and ontology T has the concept "fur", and that "fur" and "hair" are WordNet synonyms. I can use WordNet as a lexical resource to enrich S and T by adding "fur" as a synonym of "hair" and "hair" as a synonym of "fur" respectively, which will enable me to map the two concepts. However, I can also use WordNet as an intermediate "ontology" (treating its synsets as classes), which will enable me to find: S(hair) = WordNet(hair) <=> WordNet(fur) = T(fur); thereby arriving at the same mapping. Please note that this duality is not exclusive to WordNet: BK ontologies may generally be used as intermediates, but it is also possible to use them for lexical enrichment. In fact, this is something done by AML (using UBERON), which accounts in part for its success in the OAEI's Anatomy track.
While I understand that the authors wish the main focus of the paper to be on the "standard" use of BK sources as intermediates, they must also provide some coverage of the use of BK for lexical enrichment if they wish their survey to be comprehensive. Even if they still wish to differentiate between the two approaches, they should at least acknowledge and describe the latter, rather than outright excluding it from the manuscript.
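The hair/fur example above can be sketched in a few lines of code. This is only a toy illustration: the synset list, the concept identifiers and all function names are assumptions made up for the example, and it does not call the real WordNet or any matching system.

```python
# Toy data: every name and value below is an illustrative assumption.
# Each set stands in for a WordNet synset (labels treated as synonyms).
SYNSETS = [{"hair", "fur"}, {"tail"}]

S = {"S:hair": {"hair"}}  # source ontology: concept id -> labels
T = {"T:fur": {"fur"}}    # target ontology: concept id -> labels


def synonyms(label):
    """Return every label that shares a synset with `label`, itself included."""
    result = {label}
    for synset in SYNSETS:
        if label in synset:
            result |= synset
    return result


def enrich(ontology):
    """Add the synonyms of every label to each concept."""
    return {
        concept: set().union(*(synonyms(label) for label in labels))
        for concept, labels in ontology.items()
    }


def match_by_enrichment(source, target):
    """Use 1: enrich both ontologies, then map concepts that share a label."""
    enriched_s, enriched_t = enrich(source), enrich(target)
    return {
        (cs, ct)
        for cs, labels_s in enriched_s.items()
        for ct, labels_t in enriched_t.items()
        if labels_s & labels_t
    }


def anchor(ontology):
    """Map each concept to the index of every synset it shares a label with."""
    return {
        (concept, index)
        for concept, labels in ontology.items()
        for index, synset in enumerate(SYNSETS)
        if labels & synset
    }


def match_by_intermediate(source, target):
    """Use 2: anchor both ontologies to the synsets, then compose the anchors."""
    return {
        (cs, ct)
        for cs, i in anchor(source)
        for ct, j in anchor(target)
        if i == j
    }


print(match_by_enrichment(S, T))
print(match_by_intermediate(S, T))
```

On this toy input both functions return the same single mapping between the two concepts, whereas a plain comparison of the original labels would find nothing. On real data the two uses need not coincide exactly, which is consistent with their being interchangeable only to some degree.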
- Section 2.2: definitions of "useful BK" and "BK selection".
The definitions given are correct only in the context of using a single BK source. In a multi-BK setting, "BK selection" aims at finding the best combination of BK sources, and the concept of usefulness must consider not only the direct alignment, but also the alignment using other BK sources. Since the authors do contemplate the multi-BK setting in section 4.2, they should provide definitions contemplating the multi-BK setting in addition to those for the single BK setting at this stage.
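To make the difference between the two settings concrete, here is a minimal sketch of one way usefulness could be measured in a multi-BK setting: as the share of a source's mappings that are new with respect to both the direct alignment and the sources already selected. The function names, the greedy strategy and the threshold value are my own assumptions for illustration; this is not the definition used in the manuscript, nor a reproduction of any system's algorithm.

```python
def marginal_gain(candidate, covered):
    """Fraction of the candidate's mappings that are not already covered."""
    if not candidate:
        return 0.0
    return len(candidate - covered) / len(candidate)


def select_sources(direct, via_bk, threshold=0.25):
    """Greedily pick BK sources while their marginal gain stays above threshold.

    direct: set of mappings found without BK.
    via_bk: dict mapping a BK source name to the set of mappings it yields.
    threshold: hypothetical cut-off, chosen arbitrarily for this sketch.
    """
    selected = []
    covered = set(direct)
    remaining = dict(via_bk)
    while remaining:
        # sorted() makes tie-breaking deterministic
        best = max(sorted(remaining),
                   key=lambda name: marginal_gain(remaining[name], covered))
        if marginal_gain(remaining[best], covered) < threshold:
            break
        selected.append(best)
        covered |= remaining.pop(best)
    return selected, covered


# Toy alignments with made-up identifiers.
direct = {("a1", "b1"), ("a2", "b2")}
via_bk = {
    "BK1": {("a3", "b3"), ("a4", "b4"), ("a5", "b5")},
    "BK2": {("a1", "b1"), ("a3", "b3"), ("a4", "b4")},
}

print(marginal_gain(via_bk["BK2"], direct))  # BK2 judged on its own
print(select_sources(direct, via_bk))        # BK2 judged alongside BK1
```

In this toy case BK2 looks useful when judged against the direct alignment alone, since two of its three mappings are new, but it adds nothing once BK1 has been selected. A definition of usefulness that only considers the direct alignment cannot capture this.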
- Section 3.2.1: " The size of the BK is generally large with respect to the ontologies to be aligned."
This is doubtful and unsupported. In the common case of using an ontology as BK, there is no reason to assume that the BK ontology will be any larger than the ontologies being matched, and I wouldn't expect it to be so on average. It may indeed be larger in some cases (e.g., if you use a broad multi-domain biomedical resource such as the UMLS metathesaurus as BK) but it may also be smaller - e.g., when the BK source is an upper level ontology or when using BK sources that cover only a small part of the large ontologies being matched (e.g., in the SNOMED-NCI matching problem from the OAEI) - or be of approximately the same size.
- Section 5; Table 1
Not all the BK selection methods listed and discussed throughout the text are summarized in Table 1. Namely, references , , , and  are left out.
- Section 6.1: "AML-BK does not dynamically select its BK but uses a preselected BK that offer a good mapping gain score."
This is untrue. While there is a "preselection" of sorts in the sense that the BK sources available to AML are only a few, AML does test each of them in all biomedical tracks of the OAEI, using the mapping gain methodology to select which source(s) to use in each case. Within the set of available BK sources, there is no preselection of which BK sources are used in each track - for instance, DOID is only useful in SNOMED-NCI but it is tested by AML in all other biomedical tasks. However, the process is deterministic, and thus we know what BK sources will be selected in each track after testing the algorithm, which is why we listed them in the OAEI paper.
Please correct this statement and revise all subsequent statements on this subject, namely in Section 6.4 and in the last paragraph of the conclusions.
- Section 6.5; first paragraph
While increasing recall is indeed the goal for which we use BK, and some loss in precision can be expected in general by using BK, the authors are overlooking a key piece of the puzzle in this discussion of the OAEI results: the fact that the LargeBio reference alignments are derived from the UMLS Metathesaurus and thus not actual gold standards. Concretely, in the task where the greatest losses in precision are observed (Task2), the reason for these losses is that the NCI Thesaurus includes a small branch on mouse anatomy in addition to its branch on human anatomy, and that by using UBERON as BK, tools obtain mappings between both these branches and the FMA. Now, because the UMLS is focused on human health, it does not include mappings between the NCI's mouse anatomy branch and FMA, but that doesn't mean that such mappings are incorrect. Being a cross-species anatomy ontology, UBERON does include them, and explicitly at that, in the form of cross-references (which are essentially mappings, and are manually curated). Thus, the substantial loss in precision that some tools experience in Task2 is mostly artificial, due to the incompleteness of the UMLS-derived reference alignment.
This does not mean that the authors' final statement of the paragraph is not valid. But they should be careful not to overemphasize the issue of precision. Also, the final statement needs to be elaborated, as it is too shallow for a non-expert to follow.
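The effect of an incomplete reference alignment on measured precision can be shown with a small computation. All identifiers and counts below are invented for illustration and do not correspond to actual OAEI figures.

```python
def precision(alignment, reference):
    """Share of the alignment's mappings that appear in the reference."""
    return len(alignment & reference) / len(alignment) if alignment else 0.0


def recall(alignment, reference):
    """Share of the reference's mappings that the alignment found."""
    return len(alignment & reference) / len(reference) if reference else 0.0


# Hypothetical mappings: "h*" involve the human anatomy branch,
# "m*" involve the mouse anatomy branch.
human = {"h1", "h2", "h3", "h4"}
mouse = {"m1", "m2"}

incomplete_reference = human        # reference lacking the mouse mappings
complete_reference = human | mouse  # what a full gold standard would hold

without_bk = {"h1", "h2", "h3"}
with_bk = {"h1", "h2", "h3", "h4", "m1", "m2"}  # BK also yields mouse mappings

for name, alignment in [("without BK", without_bk), ("with BK", with_bk)]:
    print(name,
          "precision (incomplete ref):", round(precision(alignment, incomplete_reference), 3),
          "precision (complete ref):", round(precision(alignment, complete_reference), 3),
          "recall (incomplete ref):", round(recall(alignment, incomplete_reference), 3))
```

Against the incomplete reference, the alignment obtained with BK appears to lose a third of its precision, even though every one of its mappings is correct with respect to the complete reference. This is the sense in which the precision loss is an artifact of the reference rather than of the BK.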
- Section 6.5; second paragraph
Again, this discussion is too shallow. Faced with the evidence that the biomedical domain is the main target of BK usage in the OAEI, the authors should first explain why that is before drawing any conclusions. Concretely, in the biomedical domain there is a marriage of necessity and opportunity: on the one hand, the vocabulary is complex and specialized, which limits the effectiveness of generic lexical resources such as WordNet; on the other hand, there is an abundance of ontologies with overlapping domains, which can be exploited as BK. This combination of factors is particular to the biomedical domain, and thus it does not make sense to expect a comparable use of BK in other domains.
Furthermore, it is not true that BK is not used in other OAEI tracks. For instance, WordNet has been used by a number of tools in the Conference track. However, the authors have (erroneously, as I discussed above) discarded this use of WordNet as not being "BK-based matching". I don't really understand what sources of BK other than WordNet the authors would expect to be useful in this domain.
I also don't understand why "BK-based matching" approaches being adopted in other OAEI tracks would allow for a better evaluation. I agree that it would allow for a broader evaluation, but "better" suggests that there is something lacking in the evaluation in the biomedical domain, which is untrue - it is unquestionable that these approaches are effective in that domain.
- Section 7; question 2:
Here, again, the authors are discarding the use of WordNet for lexical enrichment as a valid usage of BK when they state that BK is only used in the biomedical tracks. The reasons behind the focus of BK approaches on the biomedical domain, which I detailed in the previous point, should be used to elaborate and clarify this section.
- Section 7; question 3:
The authors' conclusion is based upon a false premise. AML does not implement fewer similarity metrics than AgreementMaker employed in the Anatomy track; it implements equivalent metrics but does so more efficiently. Thus, AML's workflow for Anatomy is not simpler than AgreementMaker's; it is merely more computationally efficient. Moreover, the effort towards a more efficient workflow was not motivated by the usage of BK, but by the fact that AgreementMaker could not cope with the LargeBio ontologies.
It should also be noted that AML obtained very solid results in Anatomy even without the use of BK (0.886 F-measure, higher than AgreementMaker's best result without BK, from 2010), and the same is also true for LargeBio as well as for other tracks. Thus, while it is true that a large part of AML's success in biomedical tracks is due to its use of BK, and I would also agree that the use of adequate BK can replace more complex matching algorithms, the authors cannot infer the latter from the former.
- Section 7; question 4:
The part about AML having selection done a priori is erroneous, as detailed above. AML employs the algorithm detailed in reference  which has linear time complexity with regard to the size of the ontologies to test as BK. AML does benefit from having a small universe of ontologies to choose from in its OAEI configuration, whereas LogMap has to cope with all of BioPortal, but nevertheless, algorithmically, the two approaches should be comparable. Additionally, LogMapBio is slow not only because of the number of ontologies it considers, but also because it relies on BioPortal's API to access them and this creates a bottleneck in performance.
- The BK acronym:
If the authors define BK as being "Background Knowledge" (the concept), then they should refer to specific objects as "BK (re)sources" rather than simply BK. Alternatively, they should redefine BK to mean "Background Knowledge (re)source" and abstain from using it to refer to the general concept. Using the acronym both to refer to the concept of BK and to the objects that are instances of that concept makes the text confusing.
- Grammar & word-usage errors:
There are a number of these throughout the document, and it needs a careful revision by a fluent English speaker.
In the abstract alone:
"active since a decade" -> "active FOR a decade"
"in which cases the use of the background knowledge is justified and necessary?" -> "in which cases IS the use of the background knowledge justified and necessary?"
The latter is the most common grammar error in the manuscript, occurring multiple times - failure to observe subject-auxiliary inversion in interrogative sentences.
"regarding to the systems" -> either "regarding the systems" or "with regard to the systems"