Distributional methods for extracting common sense knowledge by ranking triples according to prototypicality

Submitted by Soufian Jebbara on 09/04/2017 - 07:37

Tracking #: 1713-2925

A new version of this paper is available

Authors:

Soufian Jebbara

Valerio Basile

Elena Cabrio

Philipp Cimiano

Responsible editor:

Guest Editors ML4KBG 2016

Submission type:

Full Paper

Abstract:

In this paper we are concerned with developing information extraction models that support the extraction of common sense knowledge from unstructured datasets. Our motivation is to extract manipulation-relevant knowledge that can support robots’ action planning. We frame the task as a relation extraction task and, as proof-of-concept, validate our method on the task of extracting two types of relations: locative and instrumental relations. The locative relation relates objects to the prototypical places where the given object is found or stored. The instrumental relation relates objects to their prototypical purpose of use. While we extract these relations from text, our goal is not to extract specific mentions, but rather, given an object as input, extract a list of locations and uses ranked by ‘prototypicality’. We use distributional methods in embedding space, relying on the well-known skip-gram model to embed words into a low-dimensional distributional space, using cosine similarity to rank the various candidates. In addition to using embeddings computed using the skip-gram model, we also present experiments that rely on the so-called NASARI vectors, which rely on disambiguated concepts to compute embeddings and are thus semantically aware. While this distributional approach has been published before, we extend our framework by additional methods relying on neural networks that learn a score to judge whether a given candidate pair actually expresses a desired relation. The network thus learns a scoring function using a supervised approach. While we use a ranking-based evaluation, the supervised model is trained using a binary classification task. The resulting score from the neural network and the cosine similarity in the case of the distributional approach are both used to compute a ranking. We compare the different approaches and parameterizations thereof on the task of extracting the above-mentioned relations.
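The ranking step described in the abstract can be illustrated with a minimal sketch. It assumes each word already has an embedding vector; the toy three-dimensional vectors, the dictionary `emb`, and the helper names `cosine` and `rank_candidates` are invented here for illustration and are not taken from the paper.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def rank_candidates(obj, candidates, emb):
    """Return (candidate, score) pairs sorted by descending similarity to obj."""
    scored = [(c, cosine(emb[obj], emb[c])) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy vectors, hypothetical: real skip-gram or NASARI embeddings
# would have hundreds of dimensions and be learned from data.
emb = {
    "knife":   np.array([0.9, 0.1, 0.0]),
    "kitchen": np.array([0.8, 0.2, 0.1]),
    "garage":  np.array([0.3, 0.7, 0.2]),
    "bedroom": np.array([0.0, 0.2, 0.9]),
}

ranking = rank_candidates("knife", ["bedroom", "garage", "kitchen"], emb)
for place, score in ranking:
    print(f"{place}: {score:.3f}")
```

With these made-up vectors, the candidate location closest in direction to the object vector is ranked first; a supervised scorer could be swapped in by replacing `cosine` with a learned scoring function.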
We show that the distributional similarity approach performs very well on the task. The best performing parameterization achieves an NDCG of 0.913, a Precision@1 of 0.400 and a Precision@3 of 0.423. The performance of the supervised learning approach, in spite of having been trained on positive and negative examples of the relation in question, is not as good as expected and achieves an NDCG of 0.908, a Precision@1 of 0.454 and a Precision@3 of 0.387, respectively.
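For readers unfamiliar with the reported metrics, the following sketch shows one common way to compute Precision@k and NDCG for a single ranked list. The abstract does not state the exact gain definition or relevance threshold used in the paper, so the linear gain with a log2 discount and the example values below are assumptions for illustration only.

```python
import math

def precision_at_k(relevant, k):
    """Fraction of the top-k ranked items that are relevant (0/1 flags)."""
    return sum(relevant[:k]) / k

def dcg(gains):
    """Discounted cumulative gain: the gain at rank i is divided by log2(i + 1)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg(gains):
    """DCG normalised by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0

# Hypothetical graded judgements for one object, in the order a model ranked them.
gains = [2.0, 3.0, 0.0, 1.0]
# Hypothetical binary relevance flags for the same ranked list.
relevant = [1, 1, 0, 1]

print("P@1 :", precision_at_k(relevant, 1))
print("P@3 :", precision_at_k(relevant, 3))
print("NDCG:", ndcg(gains))
```

NDCG rewards a ranking for placing high-gain items near the top over the whole list, whereas Precision@k only looks at the first k positions, so the two measures can order systems differently.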

Full PDF Version:

swj1713.pdf

Revised Version:

Extracting common sense knowledge via triple ranking using supervised and unsupervised distributional models

Previous Version:

Distributional and Neural Models for Extracting Manipulation-Relevant Relations from Text Corpora

Tags:

Reviewed

Decision/Status:

Minor Revision

Solicited Reviews:

Review #1

By Dagmar Gromann submitted on 03/Oct/2017

Suggestion:
Minor Revision

Review Comment:

Thank you for your responses and comments and the interesting revised version of the paper. I appreciate the difficulty of including all previous approaches, since this topic is broad and has been tackled from many different perspectives.

Some of my reservations persist:
- While I appreciate that you answered my questions in the comments, those details (number of participants in the crowdsourcing, judgements, etc.) are of interest not only to me and should clearly be provided in the paper as well.
- Changing the title alone does not resolve the paper's claim that you extract from unstructured data, which you clearly do not (this also applies to the argumentation about "explicit mentions in text", which is quite unclear in the introduction). This is misleading and needs to be changed in the final version.
- In spite of the claim in the paper, the methods do not seem to be fully generalizable, since they rely on location/use annotations in available knowledge bases. This reservation is partially attributable to the fact that only two types of relations were considered, which makes judgements about generalizability difficult.

Title:
While the change in title now reflects the content better, it does not, however, acknowledge the third approach (the main contribution of this publication in comparison to previous publications), since it talks only about the two distributional models.

References:
I think something went wrong with the encoding of reference [34]
=> "journal = ACM Transactions on Speech and Language Processing, volume = 8, number = 3, pages = 4-6, year = 2011"

Minor comments in order of appearance starting with abstract:
"mentions" sounds a little strange => occurrences?
"In addition to using embeddings computed using the skip-gram model," => "In addition,"
"While we use a ranking-based evaluation, the supervised model is trained using a binary classification task." => "while" is used for contrasting and there is no contrast in this sentence
"The answers... involves" => involve (should this not be require?)
"does not perform as good" => "well"
"prototypicallity" => "prototypicality"
"on the one hand based on a crowdsourcing..." => this linker serves no function in this paragraph
"while there has been a lot of work" => "while" is used for contrasting and there is no contrast in this sentence
"Such techniques are related to techniques"
"prototypical triples are assigned a higher score than aprototypical triples" => this is not a binary classification but rather a graded function, isn't it?
"In the previous section, we motivated the use" => the "previous section" is not previous but the supersection (3) of this present subsection (3.1)
"a worth of" => a wealth of
"of our approaches to other kind of relations" => kinds
"have shown that both an approach" => ?
"anonymour" => anonymous

Review #2

By Ziqi Zhang submitted on 16/Oct/2017

Suggestion:
Accept

Review Comment:

The authors have addressed many of the issues raised, and their explanations for the remaining problems are also reasonable. I think the paper is in an acceptable state, though the authors should do a final proofread to correct some typos.

One main issue that was not addressed is the genericity of the proposed method, i.e., whether it can be applied to relations other than the two evaluated in this work. Although it is acceptable to leave the work in its current state given the short time frame for the revision, I think the authors should at least discuss this in future work. Be specific and give examples of other relations that could benefit from this work, as I don't think it is clear to every reader what these relations could be.

Review #3

By Jedrzej Potoniec submitted on 17/Oct/2017

Suggestion:
Accept

Review Comment:

Below is a short summary along the main reviewing dimensions; the detailed remarks follow.
* originality: The paper is an extension of a conference paper from EKAW 2016, but there is enough new contribution for a journal paper.
* significance of the results: The results are interesting and offer an advance in the area of relation extraction.
* quality of writing: The manuscript is well written.

Overall, I am very happy to see most of my remarks addressed. The reviewed paper reads very well and I think it is ready to be published. I have only a few minor comments that should be addressed in the camera-ready version, but I do not think another round of reviews is necessary:
* Abstract and introduction speak about "binary classification", but Section 3.3 about regression.
* I am a bit surprised that the passage about using Dropout disappeared from the text. I guess it is an omission and it should be restored.
* The queries for Section 4.1 are presented only in the response letter, but I think they should be presented also in the paper.
