Data Quality meets Machine Learning and Knowledge Graphs

Topics of interest

New approaches for performing Data Quality assessment or improvement of Knowledge Graphs via Machine Learning:

Quality assessment over time
Scalability issues
Proactive approaches able to improve KG quality during the data authoring stage
Reactive approaches to improve KG quality before the data exploitation stage
Large Language Models to deal with KG quality issues
Generative Artificial Intelligence (AI) to cope with KG quality issues
AI-driven approaches to assess and improve data quality in KGs

Applications combining Machine Learning and Knowledge Graphs dealing with Data Quality concerns:

Recommender Systems leveraging (incomplete) Knowledge Graphs
Link Prediction and KG completion
Ontology Learning and Matching coping with KG consistency and accuracy
Question Answering exploiting Knowledge Graphs and Machine Learning dealing with representational issues
Quality issues in domain-specific KGs

Submission details

Submissions can fall into one of the following categories:

Full research papers (up to 15 pages, excluding references)
Short research papers (up to 8 pages, excluding references)

We welcome contributions presenting

success stories,
negative results,
reviews of the state of the art,
position papers critically discussing what is missing in the alliance of data quality, ML, and KGs.

Papers must comply with the CEUR-WS template and must be submitted in PDF format via the workshop’s OpenReview submission page.

Accepted papers (after blind review by at least two experts) will be published by CEUR-WS.

At least one author of each accepted paper must register for the workshop (pre-conference only option) for the paper to be included in the workshop proceedings. Information about registration can be found on the ESWC 2024 official page.

Important dates

Paper submission deadline: March 11, 2024 (extended from February 26, 2024)
Notification of Acceptance: March 28, 2024
Camera-ready paper due: April 18, 2024
ESWC 2024 Workshop days: May 26, 2024

Program details and Keynote

09:00 - 09:05 AM > Welcome Session
09:05 - 10:05 AM > KEYNOTE by Elena Simperl


Elena Simperl is a Professor of Computer Science at King’s College London and the Director of Research for the Open Data Institute (ODI). She is a Fellow of the British Computer Society and the Royal Society of Arts, and a Hans Fischer Senior Fellow. Elena’s work is at the intersection of AI and social computing. She features in the top 100 most influential scholars in knowledge engineering of the last decade and in the Women in AI 2000 ranking. She is the president of the Semantic Web Science Association.

Title: When stars align: studies in data quality, knowledge graphs, and machine learning

Abstract: In this talk I will present several projects that tease out the intricate relationship between these three fields of research to produce better AI datasets and, with that, better AI models and downstream applications. I will start with work in knowledge graphs, which are machine-readable structured data representations, organised for general-purpose use. Besides data integration, they are extensively used in search engines, recommender systems, virtual assistants and other AI contexts, commonly as a source of domain knowledge and explanations. I propose sociotechnical methods, drawing on machine learning and other techniques, to understand biases, improve quality, and increase people’s trust in knowledge graphs. Then I will move to ongoing work on assuring any type of AI dataset using semantic technologies. I will introduce the data-centric AI programme at the Open Data Institute and take a deep dive into Croissant, a schema.org-based vocabulary for describing AI datasets to improve their quality and reuse.

Slides: Slides are available online at the following link

10:05 - 10:30 AM > Stefani Tsaneva, Stefan Vasic, Marta Sabou. “LLM-driven Ontology Evaluation: Verifying Ontology Restrictions with ChatGPT” - paper - slides
10:30 - 11:00 AM > Coffee Break
11:00 - 11:25 AM > Jose Emilio Labra Gayo. “Extending Shape Expressions for different types of knowledge graphs” - paper - slides
11:25 - 11:40 AM > Gabriele Tuozzo. “Moving from Tabular Knowledge Graph Quality Assessment to RDF Triples Leveraging ChatGPT” - paper - slides
11:40 - 11:55 AM > Pasquale Esposito. “The Linguistic Linked Open Data through the Linguists’ Lens” - paper - slides
12:00 - 12:30 PM > Panel (Elena, Anastasia, Heiko, Paul)

Organizers


Maria Angela Pellegrino, University of Salerno, Italy
Anisa Rula, University of Brescia, Italy
Jose Emilio Labra Gayo, University of Oviedo, Spain
Michael Cochez, Vrije Universiteit Amsterdam, the Netherlands
Mehwish Alam, Institut Polytechnique de Paris, France