Editorial Board

Editors-in-Chief
Krzysztof Janowicz

Managing Editors
Cogan Shimizu
Eva Blomqvist

Editorial Board
Mehwish Alam
Claudia d’Amato
Stefano Borgo
Boyan Brodaric
Philipp Cimiano
Oscar Corcho
Bernardo Cuenca-Grau
Elena Demidova
Jerome Euzenat
Mark Gahegan
Aldo Gangemi
Anna Lisa Gentile
Rafael Goncalves
Dagmar Gromann
Armin Haller
Aidan Hogan
Katja Hose
Eero Hyvönen
Sabrina Kirrane
Agnieszka Lawrynowicz
Freddy Lecue
Maria Maleshkova
Raghava Mutharaju
Axel Polleres
Guilin Qi
Marta Sabou
Harald Sack
Christoph Schlieder
Stefan Schlobach
Oshani Seneviratne
Cogan Shimizu
Ruben Verborgh
GQ Zhang

Former Editors-in-Chief
Pascal Hitzler

Editorial Assistants
Sanaz Saki Norouzi

On General and Biomedical Text-to-Graph Large Language Models

Submitted by Sergio Consoli on 02/22/2024 - 07:13

Tracking #: 3642-4856

This paper is currently under review

Authors:

Lorenzo Bertolini

Roel Hulsman

Sergio Consoli

Antonio Puertas Gallardo

Mario Ceresa

Responsible editor:

Guest Editors KG Gen from Text 2023

Submission type:

Full Paper

Abstract:

Knowledge graphs and ontologies represent symbolic and factual information that can offer structured and interpretable knowledge. Extracting and manipulating this type of information is a crucial step in complex processes such as human reasoning. While Large Language Models (LLMs) are known to be useful for extracting and enriching knowledge graphs and ontologies, previous work has largely focused on comparing architecture-specific models (e.g. encoder-decoder only) across benchmarks from similar domains. In this work, we provide a large-scale comparison of the performance of certain LLM features (e.g. model architecture and size) and task learning methods (fine-tuning vs. in-context learning (iCL)) on text-to-graph benchmarks in two domains, namely the general and biomedical ones. Experiments suggest that, in the general domain, small fine-tuned encoder-decoder models and mid-sized decoder-only models used with iCL reach overall comparable performance with high entity and relation recognition and moderate yet encouraging graph completion. Our results further tentatively suggest that, independent of other factors, biomedical knowledge graphs are notably harder to learn and better modelled by small fine-tuned encoder-decoder architectures. Pertaining to iCL, we analyse hallucinating behaviour related to sub-optimal prompt design, suggesting an efficient alternative to prompt engineering and prompt tuning for tasks with structured model output.

Full PDF Version:

swj3642.pdf

Tags:

Under Review

Long-term Stable Link to Resources:

https://github.com/jrcf7/txt2graphLLMs
