Adaptive Watermarks: A Concept Drift-based Approach for Predicting Event-Time Progress in Data Streams

Event-time based stream processing is concerned with analyzing data with respect to its generation time. In most of the cases, data gets delayed during its journey from the source(s) to the stream processing engine. This is known as late data arrival. Among the different approaches for out-of-order...

Bibliographic Details
Main Author: Awad, Ahmed (author)
Other Authors: Traub, Jonas (author), Sakr, Sherif (author)
Published: 2019
Online Access: https://bspace.buid.ac.ae/handle/1234/2924
https://openproceedings.org/2019/conf/edbt/EDBT19_paper_211.pdf
https://doi.org/10.5441/002/edbt.2019.71
_version_ 1862980613181538304
author Awad, Ahmed
author2 Traub, Jonas
Sakr, Sherif
author2_role author
author
author_facet Awad, Ahmed
Traub, Jonas
Sakr, Sherif
author_role author
dc.creator.none.fl_str_mv Awad, Ahmed
Traub, Jonas
Sakr, Sherif
dc.date.none.fl_str_mv 2019
2025-05-06T08:21:33Z
2025-05-06T08:21:33Z
dc.identifier.none.fl_str_mv Awad A. et al. (2019) “Adaptive watermarks: A concept drift-based approach for predicting event-time progress in data streams,” Advances in Database Technology - EDBT, 2019-March, pp. 622–625.
2367-2005
https://bspace.buid.ac.ae/handle/1234/2924
https://openproceedings.org/2019/conf/edbt/EDBT19_paper_211.pdf
https://doi.org/10.5441/002/edbt.2019.71
dc.language.none.fl_str_mv en
dc.publisher.none.fl_str_mv Open proceedings org
dc.relation.none.fl_str_mv Advances in Database Technology
dc.title.none.fl_str_mv Adaptive Watermarks: A Concept Drift-based Approach for Predicting Event-Time Progress in Data Streams
dc.type.none.fl_str_mv Conference Paper
description Event-time based stream processing is concerned with analyzing data with respect to its generation time. In most of the cases, data gets delayed during its journey from the source(s) to the stream processing engine. This is known as late data arrival. Among the different approaches for out-of-order stream processing, low watermarks are proposed to inject special records within data streams, i.e., watermarks. A watermark is a timestamp which indicates that no data with a timestamp older than the watermark should be observed later on. Any element as such is considered a late arrival. Watermark generation is usually periodic and heuristic-based. The limitation of such watermark generation strategy is its rigidness regarding the frequency of data arrival as well as the delay that data may encounter. In this paper, we propose an adaptive watermark generation strategy. Our strategy decides adaptively when to generate watermarks and with what timestamp without a priori adjustment. We treat changes in data arrival frequency and changes in delays as concept drifts in stream data mining. We use an Adaptive Window (ADWIN) as our concept drift sensor for the change in the distribution of arrival rate and delay. We have implemented our approach on top of Apache Flink. We compare our approach with periodic watermark generation using two real-life data sets. Our results show that adaptive watermarks achieve a lower average latency by triggering windows earlier and a lower rate of dropped elements by delaying watermarks when out-of-order data is expected.
id budr_c9e9ab8e5f20250c9cf47cded35c6cd1
identifier_str_mv Awad A. et al. (2019) “Adaptive watermarks: A concept drift-based approach for predicting event-time progress in data streams,” Advances in Database Technology - EDBT, 2019-March, pp. 622–625.
2367-2005
language_invalid_str_mv en
network_acronym_str budr
network_name_str The British University in Dubai repository
oai_identifier_str oai:bspace.buid.ac.ae:1234/2924
publishDate 2019
publisher.none.fl_str_mv Open proceedings org
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
spellingShingle Adaptive Watermarks: A Concept Drift-based Approach for Predicting Event-Time Progress in Data Streams
Awad, Ahmed
title Adaptive Watermarks: A Concept Drift-based Approach for Predicting Event-Time Progress in Data Streams
title_full Adaptive Watermarks: A Concept Drift-based Approach for Predicting Event-Time Progress in Data Streams
title_fullStr Adaptive Watermarks: A Concept Drift-based Approach for Predicting Event-Time Progress in Data Streams
title_full_unstemmed Adaptive Watermarks: A Concept Drift-based Approach for Predicting Event-Time Progress in Data Streams
title_short Adaptive Watermarks: A Concept Drift-based Approach for Predicting Event-Time Progress in Data Streams
title_sort Adaptive Watermarks: A Concept Drift-based Approach for Predicting Event-Time Progress in Data Streams
url https://bspace.buid.ac.ae/handle/1234/2924
https://openproceedings.org/2019/conf/edbt/EDBT19_paper_211.pdf
https://doi.org/10.5441/002/edbt.2019.71
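
Illustrative note on the description field: the abstract defines a watermark as a timestamp after which no older element should arrive, and calls the usual generation strategy periodic and heuristic-based. The following is a minimal sketch of that periodic baseline only, not the paper's ADWIN-based method and not the Apache Flink API. The class name, the `period` and `max_delay` parameters, and the sample timestamps are all hypothetical, chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class PeriodicWatermarkGenerator:
    """Hypothetical baseline: after every `period` arrivals, emit a watermark
    equal to the largest event time seen so far minus a fixed `max_delay`."""
    period: int
    max_delay: float
    max_event_time: float = float("-inf")
    watermark: float = float("-inf")
    seen: int = 0

    def observe(self, event_time: float) -> bool:
        """Process one element; return True if it is a late arrival,
        i.e. its timestamp is older than the current watermark."""
        late = event_time < self.watermark
        self.max_event_time = max(self.max_event_time, event_time)
        self.seen += 1
        if self.seen % self.period == 0:
            # A watermark never moves backwards.
            candidate = self.max_event_time - self.max_delay
            self.watermark = max(self.watermark, candidate)
        return late


def count_late(event_times: Iterable[float], period: int, max_delay: float) -> int:
    """Count the elements that arrive behind the watermark."""
    gen = PeriodicWatermarkGenerator(period=period, max_delay=max_delay)
    return sum(gen.observe(t) for t in event_times)


# Made-up out-of-order stream of event timestamps.
stream = [1, 2, 3, 10, 4, 11, 5]
tight = count_late(stream, period=2, max_delay=2)  # small fixed delay
loose = count_late(stream, period=2, max_delay=7)  # large fixed delay
```

With these made-up values, the small fixed delay marks two elements as late, while the large one marks none but holds the watermark further back. That fixed trade-off is the rigidity the abstract attributes to periodic generation.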