XBeGene: Scalable XML Documents Generator by Example Based on Real Data
XML datasets of various sizes and properties are needed to evaluate the correctness and efficiency of XML-based algorithms and applications. While several downloadable datasets can be found online, these are predefined by system experts and might not be suitable to evaluate every algorithm. Tools fo...
| Main Author: | Harazaki, Manami |
|---|---|
| Other Authors: | Tekli, Joe; Yokoyama, Shohei; Fukuta, Naoki; Chbeir, Richard; Ishikawa, Hiroshi |
| Format: | conferenceObject |
| Published: | 2012 |
| Online Access: | http://hdl.handle.net/10725/5869 http://dx.doi.org/10.1007/978-3-642-28807-4_63 http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php https://link.springer.com/chapter/10.1007/978-3-642-28807-4_63 |
| _version_ | 1864513478164742144 |
|---|---|
| author | Harazaki, Manami |
| author2 | Tekli, Joe; Yokoyama, Shohei; Fukuta, Naoki; Chbeir, Richard; Ishikawa, Hiroshi |
| author2_role | author; author; author; author; author |
| author_facet | Harazaki, Manami; Tekli, Joe; Yokoyama, Shohei; Fukuta, Naoki; Chbeir, Richard; Ishikawa, Hiroshi |
| author_role | author |
| dc.contributor.none.fl_str_mv | Gaol, Ford Lumban |
| dc.creator.none.fl_str_mv | Harazaki, Manami; Tekli, Joe; Yokoyama, Shohei; Fukuta, Naoki; Chbeir, Richard; Ishikawa, Hiroshi |
| dc.date.none.fl_str_mv | 2012-08-01 2017-07-04T11:23:06Z 2017-07-04T11:23:06Z |
| dc.identifier.none.fl_str_mv | 9783642288067; http://hdl.handle.net/10725/5869; http://dx.doi.org/10.1007/978-3-642-28807-4_63; Harazaki, M., Tekli, J., Yokoyama, S., Fukuta, N., Chbeir, R., & Ishikawa, H. (2013). XBeGene: scalable XML documents generator by example based on real data. In Recent Progress in Data Engineering and Internet Technology: Volume 1 (pp. 449-460). Springer Berlin Heidelberg.; http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php; https://link.springer.com/chapter/10.1007/978-3-642-28807-4_63 |
| dc.language.none.fl_str_mv | en |
| dc.publisher.none.fl_str_mv | Springer |
| dc.relation.none.fl_str_mv | Lecture Notes in Electrical Engineering 156 |
| dc.rights.*.fl_str_mv | info:eu-repo/semantics/openAccess |
| dc.title.none.fl_str_mv | XBeGene: Scalable XML Documents Generator by Example Based on Real Data |
| dc.type.none.fl_str_mv | Conference Paper / Proceeding info:eu-repo/semantics/publishedVersion info:eu-repo/semantics/conferenceObject |
| description | XML datasets of various sizes and properties are needed to evaluate the correctness and efficiency of XML-based algorithms and applications. While several downloadable datasets can be found online, these are predefined by system experts and might not be suitable for evaluating every algorithm. Tools for generating synthetic XML documents offer an alternative solution, promoting flexibility and adaptability in generating synthetic document collections. Nonetheless, the usefulness of existing XML generators remains rather limited due to the restricted levels of expressiveness allowed to users. In this paper, we develop a novel XML By example Generator (XBeGene) for producing synthetic XML data which closely reflect the user's requirements. Inspired by the query-by-example paradigm in information retrieval, our generator system i) allows the user to provide her own sample XML documents as input, ii) analyzes the structure, occurrence frequencies, and content distributions for each XML element in the user input documents, and iii) produces synthetic XML documents which closely conform, in both structural and content features, to the user's input data. The size of each synthetic document, as well as that of the entire document collection, is also specified by the user. Clustering experiments demonstrate high correlation levels between the specified user requirements and the characteristics of the generated XML data, while timing results confirm our approach's scalability to large-scale document collections. |
| eu_rights_str_mv | openAccess |
| format | conferenceObject |
| id | LAURepo_ced05b120d20ca2ae2058b92a80f3ac8 |
| identifier_str_mv | 9783642288067; Harazaki, M., Tekli, J., Yokoyama, S., Fukuta, N., Chbeir, R., & Ishikawa, H. (2013). XBeGene: scalable XML documents generator by example based on real data. In Recent Progress in Data Engineering and Internet Technology: Volume 1 (pp. 449-460). Springer Berlin Heidelberg. |
| language_invalid_str_mv | en |
| network_acronym_str | LAURepo |
| network_name_str | Lebanese American University repository |
| oai_identifier_str | oai:laur.lau.edu.lb:10725/5869 |
| publishDate | 2012 |
| publisher.none.fl_str_mv | Springer |
| status_str | publishedVersion |
| title | XBeGene: Scalable XML Documents Generator by Example Based on Real Data |
| url | http://hdl.handle.net/10725/5869 http://dx.doi.org/10.1007/978-3-642-28807-4_63 http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php https://link.springer.com/chapter/10.1007/978-3-642-28807-4_63 |
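
The three steps listed in the description (accept sample documents, analyze per-element structure and content, produce similar synthetic documents) can be illustrated with a minimal Python sketch. This is not the XBeGene implementation, whose details are not given in this record. The function names `analyze` and `generate`, the sampling strategy (reusing observed child sequences and text values at random), and the `max_depth` guard are all assumptions made for illustration. User-specified document and collection sizes are not modeled here.

```python
import random
import xml.etree.ElementTree as ET
from collections import defaultdict


def analyze(samples):
    """Collect, per element tag, the child-tag sequences and text values
    observed in the sample documents (hypothetical helper)."""
    roots = []                     # root tags seen across the samples
    children = defaultdict(list)   # tag -> one child-tag list per occurrence
    texts = defaultdict(list)      # tag -> observed non-empty text values
    for xml_text in samples:
        root = ET.fromstring(xml_text)
        roots.append(root.tag)
        for elem in root.iter():
            children[elem.tag].append([child.tag for child in elem])
            if elem.text and elem.text.strip():
                texts[elem.tag].append(elem.text.strip())
    return roots, children, texts


def generate(model, rng, max_depth=10):
    """Build one synthetic document by sampling from the observed
    structures and contents (hypothetical helper)."""
    roots, children, texts = model

    def build(tag, depth):
        elem = ET.Element(tag)
        if texts.get(tag):
            # Reuse an observed text value; frequent values are picked more often.
            elem.text = rng.choice(texts[tag])
        if depth < max_depth:  # guard against recursive element structures
            # Reuse one observed child sequence, so occurrence counts
            # follow the empirical distribution of the samples.
            for child_tag in rng.choice(children[tag]):
                elem.append(build(child_tag, depth + 1))
        return elem

    return build(rng.choice(roots), 0)
```

A usage example: calling `generate(analyze(samples), random.Random(0))` on a list of XML strings returns an `xml.etree.ElementTree.Element` that can be serialized with `ET.tostring`. Sampling whole observed child sequences keeps sibling order realistic but cannot produce combinations that never appear in the samples; a generator aiming for more variety would need a richer statistical model.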