Online correlation for unlabeled process events: A flexible CEP-based approach

Process mining is a sub-field of data mining that focuses on analyzing timestamped and partially ordered data. This type of data is commonly called event logs. Each event is required to have at least three attributes: case ID, task ID/name, and timestamp to apply process mining techniques. Thus, any...

Bibliographic Details
Main Author: M.A. Helal, Iman (author)
Other Authors: Awad, Ahmed (author)
Published: 2022
Subjects:
Online Access: https://bspace.buid.ac.ae/handle/1234/2935
https://www.sciencedirect.com/science/article/pii/S0306437922000333?via%3Dihub
https://doi.org/10.1016/j.is.2022.102031
_version_ 1862980613473042432
author M.A. Helal, Iman
author2 Awad, Ahmed
author2_role author
author_facet M.A. Helal, Iman
Awad, Ahmed
author_role author
dc.creator.none.fl_str_mv M.A. Helal, Iman
Awad, Ahmed
dc.date.none.fl_str_mv 2022
2025-05-06T10:00:03Z
2025-05-06T10:00:03Z
dc.identifier.none.fl_str_mv Helal, I.M.A. and Awad, A. (2022) “Online correlation for unlabeled process events: A flexible CEP-based approach,” Information Systems, 108, p. 1.
0306-4379
https://bspace.buid.ac.ae/handle/1234/2935
https://www.sciencedirect.com/science/article/pii/S0306437922000333?via%3Dihub
https://doi.org/10.1016/j.is.2022.102031
dc.language.none.fl_str_mv en
dc.publisher.none.fl_str_mv ProQuest Central
dc.relation.none.fl_str_mv Information Systems v108 (Sep 2022): 1
dc.subject.none.fl_str_mv Process mining Uncorrelated events Event streams Complex event processing
dc.title.none.fl_str_mv Online correlation for unlabeled process events: A flexible CEP-based approach
dc.type.none.fl_str_mv Article
description Process mining is a sub-field of data mining that focuses on analyzing timestamped and partially ordered data. This type of data is commonly called event logs. To apply process mining techniques, each event is required to have at least three attributes: case ID, task ID/name, and timestamp. Thus, any missing information needs to be supplied first. Traditionally, events collected from different sources are manually correlated. While this might be acceptable in an offline setting, it is infeasible in an online setting. Recently, several use cases have emerged that call for applying process mining in an online setting. In such scenarios, a high-speed, high-volume stream of events flows continuously, e.g., in IoT applications, with stringent latency requirements for gaining insights into the ongoing process. Thus, event correlation must be automated and occur as the data is being received. We introduce an approach that correlates unlabeled events received on a stream. Given a set of start activities, our approach correlates unlabeled events to a case identifier. Our approach is probabilistic. That is, a single uncorrelated event can be assigned to zero or more case identifiers with different probabilities. Moreover, our approach is flexible. That is, the user can supply domain knowledge in the form of constraints that reduce the correlation space. This knowledge can be supplied while the application is running. We realize our approach using complex event processing (CEP) technologies. We implemented a prototype on top of Esper, a state-of-the-art industrial CEP engine. We compare our approach to baseline approaches. The experimental evaluation shows that our approach outperforms the baseline approaches in throughput and latency. It also shows that, on real-life logs, the accuracy of our approach is competitive with that of the baseline approaches.
id budr_d04445daa9778a975e70156ae3b21492
identifier_str_mv Helal, I.M.A. and Awad, A. (2022) “Online correlation for unlabeled process events: A flexible CEP-based approach,” Information Systems, 108, p. 1.
0306-4379
language_invalid_str_mv en
network_acronym_str budr
network_name_str The British University in Dubai repository
oai_identifier_str oai:bspace.buid.ac.ae:1234/2935
publishDate 2022
publisher.none.fl_str_mv ProQuest Central
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
spellingShingle Online correlation for unlabeled process events: A flexible CEP-based approach
M.A. Helal, Iman
Process mining Uncorrelated events Event streams Complex event processing
title Online correlation for unlabeled process events: A flexible CEP-based approach
topic Process mining Uncorrelated events Event streams Complex event processing
url https://bspace.buid.ac.ae/handle/1234/2935
https://www.sciencedirect.com/science/article/pii/S0306437922000333?via%3Dihub
https://doi.org/10.1016/j.is.2022.102031