Approximate XML structure validation technical report

Comparing XML documents with XML grammars, also known as XML document and grammar validation, is useful in various scenarios and applications such as: XML document classification, document transformation, grammar evolution, XML retrieval, and the selective dissemination of information. While exact (...

Bibliographic details
Main author: Tekli, Joe (author)
Other authors: Chbeir, Richard (author), Traina, Caetano Jr. (author), Traina, Agma J. M. (author), Fileto, Renato (author)
Format: article
Published: 2014
Online access: http://hdl.handle.net/10725/5881
http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php
https://www.semanticscholar.org/paper/Approximate-XML-Structure-Validation-Technical-%E2%80%93-Tekli-Chbeir/252b2b3540c90966d6591e26c7534fdb391db945
_version_ 1864513478179422208
author Tekli, Joe
author2 Chbeir, Richard
Traina, Caetano Jr.
Traina, Agma J. M.
Fileto, Renato
author2_role author
author
author
author
author_facet Tekli, Joe
Chbeir, Richard
Traina, Caetano Jr.
Traina, Agma J. M.
Fileto, Renato
author_role author
dc.creator.none.fl_str_mv Tekli, Joe
Chbeir, Richard
Traina, Caetano Jr.
Traina, Agma J. M.
Fileto, Renato
dc.date.none.fl_str_mv 2014
2017-07-06T09:36:16Z
2017-07-06T09:36:16Z
dc.identifier.none.fl_str_mv http://hdl.handle.net/10725/5881
Tekli, J., Chbeir, R., Traina, C., Jr., Traina, A. J. M., & Fileto, R. (2014). Approximate XML Structure Validation. Technical Report
http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php
https://www.semanticscholar.org/paper/Approximate-XML-Structure-Validation-Technical-%E2%80%93-Tekli-Chbeir/252b2b3540c90966d6591e26c7534fdb391db945
dc.language.none.fl_str_mv en
dc.rights.*.fl_str_mv info:eu-repo/semantics/openAccess
dc.title.none.fl_str_mv Approximate XML structure validation technical report
dc.type.none.fl_str_mv Article
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/article
description Comparing XML documents with XML grammars, also known as XML document and grammar validation, is useful in various scenarios and applications such as XML document classification, document transformation, grammar evolution, XML retrieval, and the selective dissemination of information. While exact (Boolean) XML validation has been extensively investigated in the literature, the more general problem of approximate (similarity-based) XML validation, i.e., document-grammar similarity evaluation, has not yet received strong attention. In this paper, we propose an original method for measuring the structural similarity between an XML document and an XML grammar (DTD or XSD), considering their most common operators that designate constraints on the existence, repeatability and alternativeness of XML elements/attributes (e.g., ?, *, minOccurs, maxOccurs, etc.). Our approach exploits the concept of tree edit distance, introducing a novel edit distance recurrence and dedicated algorithms to effectively compare XML documents and grammar structures, modeled as ordered labeled trees. Our method also inherently performs exact validation by imposing a maximum similarity threshold (minimum edit distance) on the returned results. We implemented a prototype and conducted several experiments on large sets of real and synthetic XML documents and grammars. Results underline our approach’s effectiveness in classifying similar documents with respect to predefined grammars, accurately detecting document and/or grammar modifications, and performing document and grammar relevance ranking. Time and space analyses were also conducted. This technical report contains only proofs, computation examples, and several experimental results.
eu_rights_str_mv openAccess
format article
id LAURepo_7ece8b13680aa6d211f6480604eefbb1
identifier_str_mv Tekli, J., Chbeir, R., Traina, C., Jr., Traina, A. J. M., & Fileto, R. (2014). Approximate XML Structure Validation. Technical Report
language_invalid_str_mv en
network_acronym_str LAURepo
network_name_str Lebanese American University repository
oai_identifier_str oai:laur.lau.edu.lb:10725/5881
publishDate 2014
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
spellingShingle Approximate XML structure validation technical report
Tekli, Joe
status_str publishedVersion
title Approximate XML structure validation technical report
url http://hdl.handle.net/10725/5881
http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php
https://www.semanticscholar.org/paper/Approximate-XML-Structure-Validation-Technical-%E2%80%93-Tekli-Chbeir/252b2b3540c90966d6591e26c7534fdb391db945