Unsupervised outlier detection in multidimensional data
Detecting and removing outliers is a fundamental preprocessing task; without it, analysis of the data can be misleading. Furthermore, anomalies in the data can heavily degrade the performance of machine learning algorithms. In order to detect the anomali...
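| Main author: | Atiq ur Rehman |
|---|---|
| Other authors: | Samir Brahim Belhaouari |
| Published: | 2022 |
| Subjects: | Commerce, management, tourism and services; Business systems in context; Information and computing sciences; Distributed computing and systems software; Information Systems and Management; Computer Networks and Communications; Hardware and Architecture; Information Systems |
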
| _version_ | 1864513566556553216 |
|---|---|
| author | Atiq ur Rehman (14153391) |
| author2 | Samir Brahim Belhaouari (9427347) |
| author2_role | author |
| author_facet | Atiq ur Rehman (14153391); Samir Brahim Belhaouari (9427347) |
| author_role | author |
| dc.creator.none.fl_str_mv | Atiq ur Rehman (14153391); Samir Brahim Belhaouari (9427347) |
| dc.date.none.fl_str_mv | 2022-11-22T21:18:25Z |
| dc.identifier.none.fl_str_mv | 10.1186/s40537-021-00469-z |
| dc.relation.none.fl_str_mv | https://figshare.com/articles/journal_contribution/Unsupervised_outlier_detection_in_multidimensional_data/21598509 |
| dc.rights.none.fl_str_mv | CC BY 4.0; info:eu-repo/semantics/openAccess |
| dc.subject.none.fl_str_mv | Commerce, management, tourism and services; Business systems in context; Information and computing sciences; Distributed computing and systems software; Information Systems and Management; Computer Networks and Communications; Hardware and Architecture; Information Systems |
| dc.title.none.fl_str_mv | Unsupervised outlier detection in multidimensional data |
| dc.type.none.fl_str_mv | Text; Journal contribution; info:eu-repo/semantics/publishedVersion; text; contribution to journal |
| description | <p>Detecting and removing outliers is a fundamental preprocessing task; without it, analysis of the data can be misleading. Furthermore, anomalies in the data can heavily degrade the performance of machine learning algorithms. To detect anomalies in a dataset in an unsupervised manner, this paper proposes novel statistical techniques. The proposed techniques are based on statistical methods that consider data compactness and other properties. They are found to be efficient in terms of performance, ease of implementation, and computational complexity. Furthermore, two of the proposed techniques transform the data to a unidimensional distance space to detect the outliers, so the techniques remain computationally inexpensive and feasible irrespective of the data’s dimensionality. A comprehensive performance analysis of the proposed anomaly detection schemes is presented, and the proposed schemes are found to outperform state-of-the-art methods when tested on several benchmark datasets.</p><h2>Other Information</h2> <p> Published in: Journal of Big Data<br> License: <a href="https://creativecommons.org/licenses/by/4.0" target="_blank">https://creativecommons.org/licenses/by/4.0</a><br>See article on publisher's website: <a href="http://dx.doi.org/10.1186/s40537-021-00469-z" target="_blank">http://dx.doi.org/10.1186/s40537-021-00469-z</a></p> |
| eu_rights_str_mv | openAccess |
| id | Manara2_886f2bf473bcdc48aa78b4eb8b1e1757 |
| identifier_str_mv | 10.1186/s40537-021-00469-z |
| network_acronym_str | Manara2 |
| network_name_str | Manara2 |
| oai_identifier_str | oai:figshare.com:article/21598509 |
| publishDate | 2022 |
| repository.mail.fl_str_mv | |
| repository.name.fl_str_mv | |
| repository_id_str | |
| rights_invalid_str_mv | CC BY 4.0 |
| status_str | publishedVersion |
| title | Unsupervised outlier detection in multidimensional data |
| topic | Commerce, management, tourism and services; Business systems in context; Information and computing sciences; Distributed computing and systems software; Information Systems and Management; Computer Networks and Communications; Hardware and Architecture; Information Systems |
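
The description above mentions that two of the proposed techniques map the data to a unidimensional distance space before detecting outliers. The record does not give the authors' algorithms, so the following is only a minimal sketch of that general idea under stated assumptions: the reference point (coordinate-wise median), the Euclidean distance, the MAD-based threshold, and the names `distance_space_outliers` and `k` are all illustrative choices, not taken from the paper.

```python
import numpy as np


def distance_space_outliers(X, k=3.0):
    """Flag rows of X that lie unusually far from the bulk of the data.

    Illustrative sketch only (not the paper's method):
    1. reduce each d-dimensional point to one number, its Euclidean
       distance to the coordinate-wise median of the data;
    2. apply a robust 1-D rule (median + k * scaled MAD) to those distances.
    """
    X = np.asarray(X, dtype=float)
    center = np.median(X, axis=0)                 # assumed reference point
    d = np.linalg.norm(X - center, axis=1)        # 1-D distance space
    med = np.median(d)
    mad = np.median(np.abs(d - med))
    # 1.4826 makes the MAD comparable to a standard deviation for normal data;
    # fall back to a tiny scale if all distances are (nearly) identical.
    scale = 1.4826 * mad if mad > 0 else np.finfo(float).eps
    is_outlier = (d - med) / scale > k
    return d, is_outlier


# Small demonstration on synthetic data (hypothetical, for illustration).
rng = np.random.default_rng(0)
inliers = rng.normal(0.0, 1.0, size=(200, 10))
shifted = rng.normal(20.0, 1.0, size=(5, 10))     # points far from the bulk
data = np.vstack([inliers, shifted])
distances, mask = distance_space_outliers(data)
print("flagged rows:", np.flatnonzero(mask))
```

Whatever the dimensionality of the input, the thresholding step in this sketch operates on a single vector of distances, which is the property the description attributes to the distance-space approach.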