Big Stream Processing Systems: An Experimental Evaluation

Bibliographic Details
Main Author: Shahverdi, Elkhan (author)
Other Authors: Awad, Ahmed (author), Sakr, Sherif (author)
Published: 2019
Online Access: https://bspace.buid.ac.ae/handle/1234/2922
https://ieeexplore.ieee.org/document/8750955
https://doi.org/10.1109/ICDEW.2019.00-35
Description
Summary: As the world gets more instrumented and connected, we are witnessing a flood of digital data generated from various hardware (e.g., sensors) or software in the format of flowing streams of data. Real-time processing for such massive amounts of streaming data is a crucial requirement in several application domains including financial markets, surveillance systems, manufacturing, smart cities, and scalable monitoring infrastructure. In the last few years, several big stream processing engines have been introduced to tackle this challenge. In this article, we present an extensive experimental study of five popular systems in this domain, namely, Apache Storm, Apache Flink, Apache Spark, Kafka Streams and Hazelcast Jet. We report and analyze the performance characteristics of these systems. In addition, we report a set of insights and important lessons that we have learned from conducting our experiments.