XML Grammar Matching and Comparison: Technical Report


Bibliographic Details
Main Author: Tekli, Joe (author)
Other Authors: Chbeir, Richard (author), Yetongnon, Kokou (author)
Format: conferenceObject
Published: 2017
Online Access: http://hdl.handle.net/10725/5880
http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php
https://www.researchgate.net/publication/228846206_XML_Grammar_Matching_and_Comparison_Technical_Report
Description
Summary: XML grammar matching has attracted considerable interest recently, due to the growing number of heterogeneous XML documents on the web and the increasing need to integrate, and consequently search and retrieve, XML documents originating from different data sources. In this study, we provide an approach for automatic XML matching and comparison aiming to minimize the amount of user effort required to perform the match task. We propose an extensible framework based on the concept of tree edit distance, integrating different matching criteria so as to capture XML grammar element semantic and syntactic similarities, cardinality and alternativeness constraints, as well as data-type correspondences and relative ordering. Our method is not bound to any specific XML grammar language (e.g., DTD or XSD), and covers all basic operators and constraints. In addition, our framework is flexible, enabling the user to choose mapping cardinality (i.e., 1:1, 1:n, n:1, n:n), in contrast with existing static methods (usually constrained to 1:1). User constraints and feedback are also considered in order to adjust matching results to the user's perception of correct matches. A prototype has been developed to evaluate and test our approach. Experiments on real and synthetic XML grammars demonstrate the efficiency of our matching strategy in identifying mappings, in comparison with alternative methods, while timing results underline the impact of semantic similarity evaluation on overall system performance. Hereunder, we develop the various matchers exploited in our study and present detailed experimental matching results (summarized in the main paper).
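Illustrative note: the summary names tree edit distance as the basis of the framework but gives no details. The sketch below is not the authors' method. It is a simplified, well-known variant (root relabeling plus whole-subtree insertion and deletion, in the style of Selkow) with unit costs, and it compares labels by exact string equality rather than by the semantic and syntactic similarity measures the report describes. The names `Node`, `size`, `tree_distance`, `grammar_a` and `grammar_b` are hypothetical, as are the two toy grammar trees.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Node:
    """An ordered, labeled tree node (e.g. one element declaration in a grammar)."""
    label: str
    children: list[Node] = field(default_factory=list)


def size(t: Node) -> int:
    """Number of nodes in the subtree rooted at t (cost of inserting or deleting it)."""
    return 1 + sum(size(c) for c in t.children)


def tree_distance(a: Node, b: Node) -> int:
    """Simplified tree edit distance with unit costs.

    Allowed operations: relabel a node, insert a whole subtree,
    delete a whole subtree. Child order is preserved.
    """
    root_cost = 0 if a.label == b.label else 1
    m, n = len(a.children), len(b.children)

    # d[i][j] = cost of turning the first i children of a into the first j children of b
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + size(a.children[i - 1])
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + size(b.children[j - 1])

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + size(a.children[i - 1]),   # delete subtree from a
                d[i][j - 1] + size(b.children[j - 1]),   # insert subtree from b
                d[i - 1][j - 1]
                + tree_distance(a.children[i - 1], b.children[j - 1]),  # transform
            )
    return root_cost + d[m][n]


# Two toy grammar trees (hypothetical): book(title, author(name)) and
# book(title, writer(name), year).
grammar_a = Node("book", [Node("title"), Node("author", [Node("name")])])
grammar_b = Node("book", [Node("title"), Node("writer", [Node("name")]), Node("year")])

print(tree_distance(grammar_a, grammar_b))  # 2: relabel author -> writer, insert year
```

A lower distance indicates structurally closer grammars. A framework such as the one summarized above would presumably replace the exact label test with graded similarity scores and account for cardinality, alternativeness and data-type constraints, none of which this sketch models.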