Apache Flink Data Processing Framework

Context:
- It can integrate with Apache Kafka (as a data source and a data sink).
- It can support Stateful Aggregations.
- It can be a Multi-Language Framework: Java, ...
- …
Example(s):
- Apache Flink, v1.13.1 (2021-05-28) [1]
- Apache Flink, v1.6.4 (2019-02-25) [2]
- Apache Flink, v1.3.0 (2017-06-01) [3]
- Apache Flink, v1.0.0 (2016-03-08)
- …
Counter-Example(s):
See: HPC, Apache Hadoop MapReduce, AWS Managed Flink, Exactly-Once Processing.

References

(Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/Apache_Flink Retrieved:2017-4-25.
- Apache Flink is an open source stream processing framework developed by the Apache Software Foundation. The core of Apache Flink is a distributed streaming dataflow engine written in Java and Scala. Flink executes arbitrary dataflow programs in a data-parallel and pipelined manner. ^[1] Flink's pipelined runtime system enables the execution of bulk/batch and stream processing programs. Furthermore, Flink's runtime supports the execution of iterative algorithms natively. ^[2] Flink provides a high-throughput, low-latency streaming engine as well as support for event-time processing and state management. Flink applications are fault-tolerant in the event of machine failure and support exactly-once semantics. Programs can be written in Java, Scala, Python, and SQL and are automatically compiled and optimized ^[3] into dataflow programs that are executed in a cluster or cloud environment. ^[4] Flink does not provide its own data storage system and provides data source and sink connectors to systems such as Amazon Kinesis, Apache Kafka, HDFS, Apache Cassandra, and ElasticSearch.
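
The event-time processing and state management mentioned in the quote can be illustrated with a minimal sketch. It is written in plain Java and deliberately does not use the Flink API: the class `KeyedWindowSum`, the `Event` type, and the method names are hypothetical and exist only for this illustration. The sketch assigns each event to a tumbling window by its own timestamp rather than by arrival order, and keeps one running sum per key and window, which is roughly the idea behind a keyed stateful aggregation. It ignores watermarks, checkpointing, and distribution, all of which a real Flink job would rely on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java model of a keyed, event-time tumbling-window sum (not the Flink API). */
public class KeyedWindowSum {

    /** A hypothetical input event: a key, an event timestamp in ms, and a value. */
    public static final class Event {
        final String key;
        final long timestampMs;
        final long value;

        public Event(String key, long timestampMs, long value) {
            this.key = key;
            this.timestampMs = timestampMs;
            this.value = value;
        }
    }

    /** Start of the tumbling window that contains the given event timestamp. */
    public static long windowStart(long timestampMs, long windowSizeMs) {
        return timestampMs - Math.floorMod(timestampMs, windowSizeMs);
    }

    /**
     * Folds events into per-key, per-window running sums.
     * The nested map plays the role of keyed state: each key only ever
     * reads and updates its own entries.
     */
    public static Map<String, Map<Long, Long>> aggregate(List<Event> events, long windowSizeMs) {
        Map<String, Map<Long, Long>> state = new TreeMap<>();
        for (Event e : events) {
            long start = windowStart(e.timestampMs, windowSizeMs);
            state.computeIfAbsent(e.key, k -> new TreeMap<>())
                 .merge(start, e.value, Long::sum);
        }
        return state;
    }

    public static void main(String[] args) {
        List<Event> events = new ArrayList<>();
        events.add(new Event("sensor-a", 1_000L, 5L));
        events.add(new Event("sensor-b", 2_000L, 7L));
        events.add(new Event("sensor-a", 12_000L, 1L));
        events.add(new Event("sensor-a", 4_000L, 3L)); // arrives out of order
        // prints {sensor-a={0=8, 10000=1}, sensor-b={0=7}}
        System.out.println(aggregate(events, 10_000L));
    }
}
```

The out-of-order event at 4,000 ms still lands in the window starting at 0 because the window is derived from the event's timestamp, not from its position in the input. In an actual Flink program the per-key map would be replaced by state that the runtime manages and restores after failures.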

http://www.developintelligence.com/blog/2017/02/comparing-contrasting-apache-flink-vs-spark/
- QUOTE: … At first glance, Flink and Spark would appear to be the same. The main difference is Flink was built from the ground up as a streaming product. Spark added Streaming onto their product later. …

^[1] Alexander Alexandrov, Rico Bergmann, Stephan Ewen, Johann-Christoph Freytag, Fabian Hueske, Arvid Heise, Odej Kao, Marcus Leich, Ulf Leser, Volker Markl, Felix Naumann, Mathias Peters, Astrid Rheinländer, Matthias J. Sax, Sebastian Schelter, Mareike Höger, Kostas Tzoumas, and Daniel Warneke. 2014. The Stratosphere platform for big data analytics. The VLDB Journal 23, 6 (December 2014), 939-964.
^[2] Stephan Ewen, Kostas Tzoumas, Moritz Kaufmann, and Volker Markl. 2012. Spinning fast iterative data flows. Proc. VLDB Endow. 5, 11 (July 2012), 1268-1279.
^[3] Fabian Hueske, Mathias Peters, Matthias J. Sax, Astrid Rheinländer, Rico Bergmann, Aljoscha Krettek, and Kostas Tzoumas. 2012. Opening the black boxes in data flow optimization. Proc. VLDB Endow. 5, 11 (July 2012), 1256-1267.
^[4] Daniel Warneke and Odej Kao. 2009. Nephele: efficient parallel data processing in the cloud. In: Proceedings of the 2nd Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS '09). ACM, New York, NY, USA, Article 8, 10 pages.