A Framework to Restore Semantically Affected Links in LOD Datasets

Tracking #: 3065-4279

Authors: 
Andre Regino
Julio Cesar dos Reis
Rodrigo Bonacin

Responsible editor: 
Katja Hose

Submission type: 
Full Paper
Abstract: 
Handling the effects of link changes in evolving datasets remains an open research challenge in managing RDF datasets in LOD. This is especially relevant because of the heavy interconnection present in LOD. This investigation proposed the LODMF framework for updating RDF links in a semi-automatic way. We described the detailed steps, including the identification of dataset changes and the choice of link maintenance actions. Our study provided an illustrative example fully describing the framework's execution to highlight the potentialities and challenges of our solution for repairing well-established semantically broken links. Future studies involve further experimental evaluations to better assess the framework's execution. We plan to implement complete link maintenance software tools, including an end-user interface for interacting with the candidate resources and maintenance actions. In addition, we intend to conduct a user study to evaluate the long-term use of the tool, as well as to create standard databases so that we can execute quantitative and statistical comparisons between maintenance approaches.
Tags: 
Reviewed

Decision/Status: 
Reject

Solicited Reviews:
Review #1
By Rob Brennan submitted on 31/Mar/2022
Suggestion:
Major Revision
Review Comment:

(1) originality,
The paper addresses a relevant topic, automated link maintenance in the Web of Data, and brings novelty to the problem by including automated checks for semantic drift, although the basis and effectiveness of these checks are not thoroughly explored. It is unusual that no aspects of machine-learning-based NLP or vector representations of semantics are mentioned in 2022.

(2) significance of the results,

The authors outline a general method for performing automated link maintenance, but many aspects of the method are based on arbitrary thresholds, mean combinations of multiple scoring mechanisms, and opaque similarity measures that lack empirical evidence or justification. The effectiveness of the system is not evaluated in a field trial with a real, diverse linkset; rather, a single example is worked through. This is insufficient evidence to assess the effectiveness of the approach, and no comparison is provided to similar methods: even if they work through different mechanisms, the overall relative effectiveness on the link maintenance tasks could be reported.

(3) quality of writing.

The paper is well written and generally clear. Some additional editing could improve the clarity of expression. Structurally, it is missing a significant evaluation section. The references seem appropriate; a couple of additional suggestions are provided below.

= Detailed Comments =

==Introduction

You say "Semantically broken links appear when the semantics of associated resources does not further express the meaning intended by the triple’s author."
This is not exactly clear: it is not precisely defined whether you mean the semantics of the link itself or the alignment of the semantics of the subject and object. I suggest you re-phrase to clarify.

==Related Work

You say "they [past work] do not focus on how to recover existing broken links."

The SUMMR mapping maintenance framework does address link repair, see
Meehan, Alan: The SPARQL Usage for Mapping Maintenance and Reuse Methodology. PhD thesis, Trinity College Dublin, School of Computer Science & Statistics, 2017.
http://www.tara.tcd.ie/handle/2262/81715
and
Meehan, A., Kontokostas, D., Freudenberg, M., Brennan, R., O'Sullivan, D. (2016). Validating Interlinks Between Linked Data Datasets with the SUMMR Methodology. In: On the Move to Meaningful Internet Systems: OTM 2016 Conferences. Lecture Notes in Computer Science, vol 10033. Springer, Cham. https://doi.org/10.1007/978-3-319-48472-3_39
https://link.springer.com/chapter/10.1007/978-3-319-48472-3_39

==Sec 3.1=

It would be good to explain how the linkset is determined for a dataset.

==Sec 3.2=

It would be useful to discuss how applicable your assumption that only one dataset evolves at a time is in practice.

==Sec 3.5.1

Top k Candidates
Why do you use a simple mean of the comparison types? Do you have any evidence this is the most appropriate?
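
To make this concern concrete, here is a minimal sketch (Python) of the difference between a simple mean and one obvious alternative, a weighted mean; the comparison names, score values and weights are hypothetical, not taken from the paper:

# Minimal sketch, not the authors' code: combining per-comparison scores
# for one replacement candidate. Comparison names, scores and weights are
# hypothetical.
from statistics import mean

def simple_mean(scores):
    # Equal weight for every comparison type, as the paper appears to do.
    return mean(scores.values())

def weighted_mean(scores, weights):
    # One alternative that could be evaluated empirically: tuned weights.
    total = sum(weights[k] for k in scores)
    return sum(scores[k] * weights[k] for k in scores) / total

candidate = {"label": 0.82, "type": 1.0, "property_overlap": 0.40}
print(simple_mean(candidate))
print(weighted_mean(candidate, {"label": 0.5, "type": 0.2, "property_overlap": 0.3}))

Even a small experiment reporting how the final candidate ranking changes under such alternatives would strengthen this design choice.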

==3.5.2. Maintenance Actions

Do you have any evidence for how changing Beta from 0.5 changes system performance/behaviour?

In terms of generating all these candidate actions, do you have any evidence for how practical this is for large datasets? This is important because, as you say, automation is key when dealing with huge datasets but if the overhead is too high then the process is not likely to be practical.

=4=

This needs a detailed study of the performance of the system with a whole set of complex links between multiple datasets. For example, Meehan examined 1,673,634 interlink category mappings of the v.2015-10 DBpedia release.

=5=
You say "It should also be noted that there are no ready-to-use gold standard datasets for comparative analyses in the specific context of this study."

Please see SUMMR mapping maintenance framework evaluations based on maintaining the whole link set of DBpedia releases.

You say "In addition, existing proposals found in the literature do not deal with the same issue investigated in our framework, which made side-by-side quantitative and statistical comparisons based on objective metrics with other studies impossible at the time."

It is possible to compare the effectiveness of different approaches in terms of the precision and recall of their ability to maintain a link-set, no matter what the mechanism.
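
As an illustration of what such a mechanism-agnostic comparison could look like, the following sketch (Python) scores a maintained link-set against a reference link-set; the links shown are hypothetical placeholders:

# Illustrative sketch only: precision/recall of a maintained link-set against
# a reference link-set, independent of the maintenance mechanism.
def precision_recall(maintained, reference):
    tp = len(maintained & reference)
    precision = tp / len(maintained) if maintained else 0.0
    recall = tp / len(reference) if reference else 0.0
    return precision, recall

maintained = {("g1:Niagara_Falls", "owl:sameAs", "g2:Niagara_Falls_city")}
reference = {("g1:Niagara_Falls", "owl:sameAs", "g2:Niagara_Falls_city"),
             ("g1:Campinas", "owl:sameAs", "g2:Campinas")}
print(precision_recall(maintained, reference))  # (1.0, 0.5)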

You say "We assume that, in our framework, the quality of the results is correlated with the right choice of background knowledge, which is used in the task of finding suitable candidates to replace the broken part of the link."

It would be very useful to quantify this assumption by experiment, even an initial experiment to see the difference between the 2 semantic similarity measures you use.
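
Even without a gold standard, an initial experiment of the kind meant here is cheap to run: score the same (broken resource, candidate) label pairs with both measures and inspect where they disagree. A sketch follows (Python); the two measures below are generic stand-ins, not the measures actually used in the paper:

# Sketch of an initial comparison of two similarity measures on the same pairs.
# Both measures here are generic stand-ins, not the paper's measures.
from difflib import SequenceMatcher

def char_ratio(a, b):
    # Character-level similarity.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def token_jaccard(a, b):
    # Token-overlap similarity.
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

pairs = [("Niagara Falls", "Niagara Falls (city)"),
         ("Niagara Falls", "Victoria Falls")]
for a, b in pairs:
    print(a, "|", b, "->", round(char_ratio(a, b), 2), round(token_jaccard(a, b), 2))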

= Typos and Grammar Suggestions =

==Abstract

maintenance of RDF links consistency -> maintenance of RDF link_ consistency

LODMF - expand

operations on RDF triples deals with affected links to other repositories
-> not clear what you mean by "deals"; is there a word missing?

users assistance -> users_'_ assistance

==Introduction

over time a challenging task -> over time _into_ a challenging task

Existing literature on link maintenance -> _The e_xisting literature on link maintenance

Review #2
Anonymous submitted on 31/Mar/2022
Suggestion:
Reject
Review Comment:

This paper introduces a framework for the very important problem of restoring Semantically Affected Links. The paper describes all the methods and steps of the framework in detail, with running examples and a real scenario. However, in my opinion, the paper lacks experimental evaluation regarding both the efficiency and effectiveness of the presented framework; moreover, the algorithms could be written more formally and their complexities should be analyzed. For this reason, my suggestion is that the paper be rejected, since I believe that the changes and advances required go beyond what could reasonably be expected for a revision.

However, my recommendation to the authors is to improve (predominantly) the section containing the algorithms and the experimental evaluation section, e.g., by adding either real or synthetic datasets in order to provide comparative results, and/or by performing a user evaluation (more details are given below).

Advantages

ADV 1. The framework can be important for the problem of restoring Semantically Affected Links.

ADV 2. The figures and examples are quite helpful for understanding the proposed methods.

ADV 3. The real use case scenario is quite important for understanding the steps of LODMF.

Disadvantages

DIS 1. The most important issue is the experimental evaluation part. In my opinion, that part is too weak for a journal paper (a single use case is not enough). Experiments should be done for both efficiency and effectiveness. More details/ideas are given below.

DIS 2. Time and Space complexities of the algorithms are missing.

DIS 3. Algorithms should be described more formally, i.e., they should not contain just the names of methods such as compareLabels, compareTypes, etc.

DIS 4. Related work section can be improved.

Abstract & Introduction.

The abstract and the introduction are well written, except for some specific parts, which are listed below.

Please add a reference to the first paragraph of the introduction, such as “Bizer, Christian, Tom Heath, and Tim Berners-Lee. "Linked data: The story so far." Semantic services, interoperability and web applications: emerging concepts. IGI global, 2011. 205-227.”

There is no need to have so many details about the steps of the approach in the introduction, i.e., "This article proposes the" … "of the datasets". Please focus on the most important points of your contribution.

“we present improvements in the framework” → why?
Please explain the limitations of the previous approach that led you to these improvements, and the novelty of the proposed article.

Section 2. Related Work

Please reorganize the section according to the following comments:
“Our proposed LODMF framework” - LOD should go to the end of the section.
First, analyze the related approaches and add more details about the similarities/differences between your framework and the other tools. Maybe one solution is to add a table comparing the different tools/frameworks along several dimensions (e.g., input, output, properties that they use for comparing the links, etc.).

“Our proposal advances state of the art in addressing how to handle semantic broken links as automatically as possible.” → Please explain the novel part of the proposed work.

Section 3.

Algorithms 1 & 2 contain several methods that are not explained formally, for instance detectChanges, recUnAffecLinks, getInitialCandidates, …
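
To illustrate the level of explicitness I have in mind, each such method could be given an explicit signature and definition, for example along the following lines (Python; the bodies are hypothetical illustrations, not the definitions used in the paper):

# Hypothetical illustration of making the comparison methods explicit;
# these are not the paper's actual definitions.
from difflib import SequenceMatcher

def compareLabels(label_a, label_b):
    # Similarity in [0, 1] between two rdfs:label values.
    return SequenceMatcher(None, label_a.lower(), label_b.lower()).ratio()

def compareTypes(types_a, types_b):
    # Jaccard overlap of the rdf:type sets of two resources.
    union = types_a | types_b
    return len(types_a & types_b) / len(union) if union else 0.0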

Moreover, there is no evidence about the time and space complexity of the algorithms, and the paper does not introduce experiments about the efficiency of LODMF.
Furthermore, one question is why you have chosen to compute the mean. Have you tested several cases, and was it the most effective one? Generally, a lot of questions can arise due to the lack of experiments.

Section 4.

The scenario that is introduced in the paper is quite interesting and helpful for understanding the steps of the framework. However, experimental evaluation is missing, which is the main limitation of the article. I can understand that it is quite difficult to create a gold standard; however, there are several alternatives that you should think about: a) a user evaluation, b) synthetic datasets for checking either simple or complex cases and for comparing the different methods through comparative results, and c) real datasets, for checking at least the efficiency of the framework (and which factors can affect the efficiency).

Sections 5 and 6

In my opinion, these two sections can be merged (with fewer details from Section 5).

Minor points

Section 5 discuss → Section 5 discusses
with label “Niagara Falls” → with the label “Niagara Falls”
such candidate (in the row) presents similarity value → such a candidate (in the row) presents a similarity value
could no be computed → could not be computed
If the user choose → If the user chooses
of similarity computation those resources → of similarity computation of those resources

Review #3
By Heiko Paulheim submitted on 13/Apr/2022
Suggestion:
Reject
Review Comment:

The authors look at the problem of evolving linked open datasets and the impact of that evolution on data interlinking. They state that evolution can lead to outdated dataset interlinks, and propose a method to fix those.

The topic of the paper is relevant and in scope of SWJ, but there are some aspects that deserve attention.

First and foremost, there is no evaluation of the approach. Without that evaluation, there is basically no evidence that the approach works as desired. At the same time, an evaluation protocol is easily imaginable. One could use a dataset which comes with different versions, such as DBpedia, and run the approach on a few linksets. In a subsequent analysis, one could manually inspect samples of the identified broken links and suggested fixes and determine their accuracy.

In some parts, the paper lacks clarity. For example, section 3.4 does not contain a precise definition of what an affected link should be. Likewise, table 1 suggests that there are some metrics associated with the strategies described in section 3.5 that output numeric scores, but those metrics are never defined.

To begin with, the problem definition should be crisper. What kind of defects should be located by the approach? Are you only looking for *dead links*, i.e., links to resources that do not exist any more in an updated version of a graph, or also links that still exist, but become invalid due to concept drift in one graph, or other similar phenomena?
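
For reference, the dead-link case alone is straightforward to state operationally; the sketch below (Python, with hypothetical URIs and graphs represented as plain sets of triples) flags links whose target resource no longer appears in the updated version, and says nothing about concept drift:

# Sketch: a link is "dead" if its target no longer appears in the updated
# target graph. Graphs are plain sets of (s, p, o) tuples; URIs are hypothetical.
def resources(graph):
    return {s for s, _, _ in graph} | {o for _, _, o in graph}

def dead_links(linkset, target_new_version):
    existing = resources(target_new_version)
    return {(s, p, o) for (s, p, o) in linkset if o not in existing}

links = {("g1:Niagara_Falls", "owl:sameAs", "g2:Niagara_Falls_city")}
new_target_version = set()  # the target resource was removed in the new version
print(dead_links(links, new_target_version))

A link that still resolves but has drifted semantically would pass this check, which is why the distinction matters for the problem definition.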

I also have a hard time understanding the notion of a "changed triple", which is quite essential in the paper. In the example in Fig. 6, this would require that the two triples in R^S^j and R^S^j+1 are somewhat connected, but this is an assumption that is not met by most knowledge graphs (i.e., for DBpedia, you can access the different versions, but there is no information that states that triple X was replaced by triple Y). Consider the following example:
Version 1: t1=<g1:Jon_Smith, born_in, g1:New_York_City>
Version 2: t2=<g1:John_Smith, born_in, g1:New_York_City>
Given that t1 does not exist in version 2, and t2 does not exist in version 1, there are still different interpretations: this might be the correction of a typo which led to a different URI in version 2, but it might also be that Jon_Smith was deleted and John_Smith has been inserted, but they are different persons accidentally born in New York City. There seems to be no way of determining whether t2 is a replacement of t1 or an entirely new triple.
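
The underlying difficulty can be stated very compactly: diffing two versions yields only unordered sets of removed and added triples, with nothing that pairs a removed triple with the added triple that (perhaps) replaces it. For example (Python; the triples are the hypothetical ones from above):

# Diffing two versions gives removed/added sets, but no replacement pairing.
version_1 = {("g1:Jon_Smith", "born_in", "g1:New_York_City")}
version_2 = {("g1:John_Smith", "born_in", "g1:New_York_City")}

removed = version_1 - version_2   # t1 disappeared
added = version_2 - version_1     # t2 appeared
# Whether the added triple corrects the removed one, or describes a different
# person, cannot be decided from the diff alone.
print(removed, added)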

Likewise, I would have assumed that new evidence about entities is taken into account. Take again the example that the first knowledge graph comes with these two triples:
Version 1: t1=
Version 2: t2=
and the linked graph has the triple , with a link between g1:John_Smith and g2:John_Smith. I would assume that a decent link repair approach would find out that replacing t1 with t2 makes the link less likely, but I do not see that anywhere in the approach.

I also see an oversimplification in only considering triple removal and subject/predicate/object changes. What about triple additions? For example, if, in the former example, g1 did not contain t1 at all, with a link between g1:John_Smith and g2:John_Smith, the addition of t2 in the new version would also make the link less likely. This case is not covered in the authors' approach.

The approach seems to have some severe limitations with respect to cases where certain metrics cannot be computed (like suggesting the category as a replacement candidate in the running example). This could be fixed in two respects: (a) treat missing values differently (e.g., by replacing them using missing value imputation), and (b) change the final score computation from a simple mean that ignores missing values to one that adds a penalty term for missing scores.
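
A sketch of option (b) follows (Python; treating a missing score as the worst possible value is one hypothetical penalty choice among several):

# Simple mean over computable scores vs. a mean that penalizes missing scores.
def mean_ignoring_missing(scores):
    present = [s for s in scores if s is not None]
    return sum(present) / len(present) if present else 0.0

def mean_with_penalty(scores, missing_value=0.0):
    # Each comparison that could not be computed counts as missing_value.
    filled = [missing_value if s is None else s for s in scores]
    return sum(filled) / len(filled)

scores = [0.9, None, 0.8]             # e.g., one comparison could not be computed
print(mean_ignoring_missing(scores))  # 0.85  (missing score silently ignored)
print(mean_with_penalty(scores))      # ~0.57 (candidate is penalized)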

The paper is also not too easy to follow in certain parts, and could be improved with respect to readability. Generally, it would help to use triple notation like <s,p,o> throughout. With that, some formulas could be rewritten in a more accessible form, e.g., the ModifiedSubjectTriple in 3.3 could be rewritten as <s,p,o> \in R^S^j, <s,p,o'> \in R^S^j+1, o != o'. (At the same time, this definition is not sufficient, since for a *modified* subject, one would also have to demand that <s,p,o> \notin R^S^j+1, <s,p,o'> \notin R^S^j.)

The related work section also seems incomplete. Relevant unconsidered works include, but are not limited to:
* Halpin et al. (2010): When owl: sameas isn't the same: An analysis of identity in linked data
* Paulheim (2014): Identifying wrong links between datasets by multi-dimensional outlier detection.
* Yaghouti et al. (2015): A Metric-driven Approach for Interlinking Assessment of RDF Graphs
* Valdestilhas et al. (2017): Cedal: time-efficient detection of erroneous links in large-scale link repositories
* Paris (2018): Assessing the Quality of owl:sameAs Links
* Raad et al. (2018): Detecting erroneous identity links on the web using network metrics
* da Silva et al. (2020): Dissimilarity-based approach for Identity Link Invalidation

Minor points:
* Lines 8 and 9 in Algorithm 1 should be replaced by calls to the actual functions (getCandidates, suggestAction) rather than Algorithm 2 and Algorithm 3
* Listing 1 is rather trivial and could be left out; readers of SWJ should know how to retrieve an instance's label via SPARQL
* on p.14, it says "some maintainer changed...", but DBpedia is not created like that and has no such maintainers.
* The shortcut in the table 2 caption ("http://dbpedia.org.br/resource") does not exist.

Overall, my assessment is that this paper is not mature enough for a publication in SWJ.