XBeGene: Scalable XML Documents Generator by Example Based on Real Data

XML datasets of various sizes and properties are needed to evaluate the correctness and efficiency of XML-based algorithms and applications. While several downloadable datasets can be found online, these are predefined by system experts and might not be suitable to evaluate every algorithm. Tools fo...

Bibliographic Details
Main Author: Harazaki, Manami (author)
Other Authors: Tekli, Joe (author), Yokoyama, Shohei (author), Fukuta, Naoki (author), Chbeir, Richard (author), Ishikawa, Hiroshi (author)
Format: conferenceObject
Published: 2012
Online Access: http://hdl.handle.net/10725/5869
http://dx.doi.org/10.1007/978-3-642-28807-4_63
http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php
https://link.springer.com/chapter/10.1007/978-3-642-28807-4_63
_version_ 1864513478164742144
author Harazaki, Manami
author2 Tekli, Joe
Yokoyama, Shohei
Fukuta, Naoki
Chbeir, Richard
Ishikawa, Hiroshi
author2_role author
author
author
author
author
author_facet Harazaki, Manami
Tekli, Joe
Yokoyama, Shohei
Fukuta, Naoki
Chbeir, Richard
Ishikawa, Hiroshi
author_role author
dc.contributor.none.fl_str_mv Gaol, Ford Lumban
dc.creator.none.fl_str_mv Harazaki, Manami
Tekli, Joe
Yokoyama, Shohei
Fukuta, Naoki
Chbeir, Richard
Ishikawa, Hiroshi
dc.date.none.fl_str_mv 2012-08-01
2017-07-04T11:23:06Z
2017-07-04T11:23:06Z
dc.identifier.none.fl_str_mv 9783642288067
http://hdl.handle.net/10725/5869
http://dx.doi.org/10.1007/978-3-642-28807-4_63
Harazaki, M., Tekli, J., Yokoyama, S., Fukuta, N., Chbeir, R., & Ishikawa, H. (2013). XBeGene: scalable XML documents generator by example based on real data. In Recent Progress in Data Engineering and Internet Technology: Volume 1 (pp. 449-460). Springer Berlin Heidelberg.
http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php
https://link.springer.com/chapter/10.1007/978-3-642-28807-4_63
dc.language.none.fl_str_mv en
dc.publisher.none.fl_str_mv Springer
dc.relation.none.fl_str_mv Lecture Notes in Electrical Engineering
156
dc.rights.*.fl_str_mv info:eu-repo/semantics/openAccess
dc.title.none.fl_str_mv XBeGene: Scalable XML Documents Generator by Example Based on Real Data
dc.type.none.fl_str_mv Conference Paper / Proceeding
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/conferenceObject
description XML datasets of various sizes and properties are needed to evaluate the correctness and efficiency of XML-based algorithms and applications. While several downloadable datasets can be found online, these are predefined by system experts and might not be suitable to evaluate every algorithm. Tools for generating synthetic XML documents offer an alternative solution, promoting flexibility and adaptability in generating synthetic document collections. Nonetheless, the usefulness of existing XML generators remains rather limited due to the restricted levels of expressiveness allowed to users. In this paper, we develop a novel XML By example Generator (XBeGene) for producing synthetic XML data which closely reflect the user's requirements. Inspired by the query-by-example paradigm in information retrieval, our generator system i) allows the user to provide her own sample XML documents as input, ii) analyzes the structure, occurrence frequencies, and content distributions for each XML element in the user input documents, and iii) produces synthetic XML documents which closely conform, in both structural and content features, to the user's input data. The size of each synthetic document, as well as that of the entire document collection, is also specified by the user. Clustering experiments demonstrate high correlation levels between the specified user requirements and the characteristics of the generated XML data, while timing results confirm our approach's scalability to large-scale document collections.
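The three steps named in the description (take sample documents, analyze per-element structure and content, emit similar synthetic documents) can be illustrated with a minimal Python sketch. This is not the XBeGene implementation, which this record does not describe in detail. The function names `learn_profile`, `generate_document`, and `generate_collection`, the `SAMPLES` data, and the sampling strategy (reusing child layouts and text values observed in the samples) are all assumptions made for illustration. The sketch controls only the collection size, not the size of each document.

```python
import random
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Hypothetical sample input; any well-formed XML strings would do.
SAMPLES = [
    "<lib><book><title>A</title></book><book><title>B</title></book></lib>",
    "<lib><book><title>C</title></book></lib>",
]


def learn_profile(sample_docs):
    """Record, per element tag, the child layouts and text values seen in the samples."""
    children = defaultdict(list)  # tag -> list of Counter(child tag -> occurrences)
    texts = defaultdict(list)     # tag -> list of observed non-empty text values
    roots = []                    # root tags seen across the samples
    for doc in sample_docs:
        root = ET.fromstring(doc)
        roots.append(root.tag)
        for elem in root.iter():
            children[elem.tag].append(Counter(child.tag for child in elem))
            text = (elem.text or "").strip()
            if text:
                texts[elem.tag].append(text)
    return {"roots": roots, "children": children, "texts": texts}


def generate_document(profile, rng, max_depth=20):
    """Build one synthetic document by sampling observed layouts and texts."""
    def build(tag, depth):
        elem = ET.Element(tag)
        observed_texts = profile["texts"].get(tag)
        if observed_texts:
            elem.text = rng.choice(observed_texts)
        # max_depth guards against unbounded recursion on self-nesting tags.
        if depth < max_depth:
            layout = rng.choice(profile["children"][tag])
            for child_tag, count in layout.items():
                for _ in range(count):
                    elem.append(build(child_tag, depth + 1))
        return elem

    root = build(rng.choice(profile["roots"]), 0)
    return ET.tostring(root, encoding="unicode")


def generate_collection(profile, n_docs, seed=None):
    """Generate n_docs synthetic documents; the seed makes runs repeatable."""
    rng = random.Random(seed)
    return [generate_document(profile, rng) for _ in range(n_docs)]


if __name__ == "__main__":
    for xml_text in generate_collection(learn_profile(SAMPLES), n_docs=3, seed=0):
        print(xml_text)
```

Sampling whole child layouts keeps sibling counts consistent with the input, at the cost of never producing a layout that was not observed. A generator that models occurrence frequencies as distributions would generalize further.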
eu_rights_str_mv openAccess
format conferenceObject
id LAURepo_ced05b120d20ca2ae2058b92a80f3ac8
identifier_str_mv 9783642288067
Harazaki, M., Tekli, J., Yokoyama, S., Fukuta, N., Chbeir, R., & Ishikawa, H. (2013). XBeGene: scalable XML documents generator by example based on real data. In Recent Progress in Data Engineering and Internet Technology: Volume 1 (pp. 449-460). Springer Berlin Heidelberg.
language_invalid_str_mv en
network_acronym_str LAURepo
network_name_str Lebanese American University repository
oai_identifier_str oai:laur.lau.edu.lb:10725/5869
publishDate 2012
publisher.none.fl_str_mv Springer
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
spellingShingle XBeGene: Scalable XML Documents Generator by Example Based on Real Data
Harazaki, Manami
status_str publishedVersion
title XBeGene: Scalable XML Documents Generator by Example Based on Real Data
title_full XBeGene: Scalable XML Documents Generator by Example Based on Real Data
title_fullStr XBeGene: Scalable XML Documents Generator by Example Based on Real Data
title_full_unstemmed XBeGene: Scalable XML Documents Generator by Example Based on Real Data
title_short XBeGene: Scalable XML Documents Generator by Example Based on Real Data
title_sort XBeGene: Scalable XML Documents Generator by Example Based on Real Data
url http://hdl.handle.net/10725/5869
http://dx.doi.org/10.1007/978-3-642-28807-4_63
http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php
https://link.springer.com/chapter/10.1007/978-3-642-28807-4_63