RDFFrames: knowledge graph access for machine learning tools

Knowledge graphs represented as RDF datasets are integral to many machine learning applications. RDF is supported by a rich ecosystem of data management systems and tools, most notably RDF database systems that provide a SPARQL query interface. Surprisingly, machine learning tools for knowl...

Bibliographic Details
Main Author: Aisha Mohamed (5152970) (author)
Other Authors: Ghadeer Abuoda (14150706) (author), Abdurrahman Ghanem (12119838) (author), Zoi Kaoudi (14150709) (author), Ashraf Aboulnaga (14150712) (author)
Published: 2021
Subjects:
_version_ 1864513567972130816
author Aisha Mohamed (5152970)
author2 Ghadeer Abuoda (14150706)
Abdurrahman Ghanem (12119838)
Zoi Kaoudi (14150709)
Ashraf Aboulnaga (14150712)
author2_role author
author
author
author
author_facet Aisha Mohamed (5152970)
Ghadeer Abuoda (14150706)
Abdurrahman Ghanem (12119838)
Zoi Kaoudi (14150709)
Ashraf Aboulnaga (14150712)
author_role author
dc.creator.none.fl_str_mv Aisha Mohamed (5152970)
Ghadeer Abuoda (14150706)
Abdurrahman Ghanem (12119838)
Zoi Kaoudi (14150709)
Ashraf Aboulnaga (14150712)
dc.date.none.fl_str_mv 2021-08-26T06:00:00Z
dc.identifier.none.fl_str_mv 10.1007/s00778-021-00690-5
dc.relation.none.fl_str_mv https://figshare.com/articles/journal_contribution/RDFFrames_knowledge_graph_access_for_machine_learning_tools/21597114
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Information and computing sciences
Data management and data science
Machine learning
Knowledge graphs
RDF
SPARQL
PyData
Data preparation
Machine learning
dc.title.none.fl_str_mv RDFFrames: knowledge graph access for machine learning tools
dc.type.none.fl_str_mv Text
Journal contribution
info:eu-repo/semantics/publishedVersion
text
contribution to journal
description <p>Knowledge graphs represented as RDF datasets are integral to many machine learning applications. RDF is supported by a rich ecosystem of data management systems and tools, most notably RDF database systems that provide a SPARQL query interface. Surprisingly, machine learning tools for knowledge graphs do not use SPARQL, despite the obvious advantages of using a database system. This is due to the mismatch between SPARQL and machine learning tools in terms of data model and programming style. Machine learning tools work on data in tabular format and process it using an imperative programming style, while SPARQL is declarative and has as its basic operation matching graph patterns to RDF triples. We posit that a good interface to knowledge graphs from a machine learning software stack should use an imperative, navigational programming paradigm based on graph traversal rather than the SPARQL query paradigm based on graph patterns. In this paper, we present RDFFrames, a framework that provides such an interface. RDFFrames provides an imperative Python API that gets internally translated to SPARQL, and it is integrated with the PyData machine learning software stack. RDFFrames enables the user to make a sequence of Python calls to define the data to be extracted from a knowledge graph stored in an RDF database system, and it translates these calls into a compact SPARQL query, executes it on the database system, and returns the results in a standard tabular format.
Thus, RDFFrames is a useful tool for data preparation that combines the usability of PyData with the flexibility and performance of RDF database systems.</p><h2>Other Information</h2> <p> Published in: The VLDB Journal<br> License: <a href="https://creativecommons.org/licenses/by/4.0" target="_blank">https://creativecommons.org/licenses/by/4.0</a><br>See article on publisher's website: <a href="http://dx.doi.org/10.1007/s00778-021-00690-5" target="_blank">http://dx.doi.org/10.1007/s00778-021-00690-5</a></p>
eu_rights_str_mv openAccess
id Manara2_ee06183d581068c4595eba110f3e3ac9
identifier_str_mv 10.1007/s00778-021-00690-5
network_acronym_str Manara2
network_name_str Manara2
oai_identifier_str oai:figshare.com:article/21597114
publishDate 2021
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
status_str publishedVersion
title RDFFrames: knowledge graph access for machine learning tools
topic Information and computing sciences
Data management and data science
Machine learning
Knowledge graphs
RDF
SPARQL
PyData
Data preparation
Machine learning