Linked Data Completeness: A Systematic Literature Review

Tracking #: 2265-3478

This paper is currently under review
Authors: 
Subhi Issa
Onaopepo Adekunle
Fayçal Hamdi
Samira Si-said Cherfi
Michel Dumontier
Amrapali Zaveri

Responsible editor: 
Agnieszka Lawrynowicz

Submission type: 
Survey Article
Abstract: 
The quality of Linked Data is an important indicator of their fitness for use in an application. Several quality dimensions have been identified for assessing it, such as accuracy, completeness, timeliness, provenance, and accessibility. While many prior studies offer a landscape view of data quality dimensions, here we present a systematic literature review of approaches for assessing the completeness of Linked Data. We gather existing approaches from the literature and analyze them qualitatively and quantitatively. In particular, we unify and formalize commonly used terminology across 52 articles related to the completeness dimension of data quality and provide a comprehensive list of methodologies and metrics used to evaluate the different types of completeness. We identify seven types of completeness, including three that were not identified in earlier surveys. We also analyze nine tools capable of assessing Linked Data completeness. The aim of this systematic literature review is to give researchers and data curators a comprehensive and deeper understanding of existing work on completeness and its properties, thereby encouraging further experimentation and the development of new approaches focused on completeness as a data quality dimension of Linked Data.