Apache Flink Data Processing Framework
An Apache Flink Data Processing Framework is a Data Processing Framework managed by an Apache Flink Project.
- Context:
- ...
- Counter-Example(s):
- a Hadoop MapReduce Framework (to which Flink is an alternative).
- See: HPC.
2015
- https://flink.apache.org/
- Fast: State-of-the-art performance exploiting in-memory processing and data streaming.
- Reliable: Flink is designed to perform very well even when the cluster's memory runs out.
- Expressive: Write beautiful, type-safe code in Java and Scala, and execute it on a cluster (see the sketch after this list).
- Easy to use: Few configuration parameters required. Cost-based optimizer built in.
- Scalable: Tested on clusters of 100s of machines, Google Compute Engine, and Amazon EC2.
- Hadoop-compatible: Flink runs on YARN and HDFS and has a Hadoop compatibility package.
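To make the "type-safe code in Java" claim concrete, here is a minimal batch WordCount sketch against Flink's DataSet API (the API documented for the 0.8.x release cited below); the class name and input literals are illustrative, not taken from the source.
```java
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

// Minimal WordCount sketch against Flink's batch DataSet API.
public class WordCount {
    public static void main(String[] args) throws Exception {
        final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Illustrative in-memory input; a real job would read from a file or HDFS.
        DataSet<String> text = env.fromElements(
                "flink runs on yarn",
                "flink reads from hdfs");

        DataSet<Tuple2<String, Integer>> counts = text
                // split each line into (word, 1) pairs
                .flatMap(new Tokenizer())
                // group by the word (tuple field 0) and sum the counts (field 1)
                .groupBy(0)
                .sum(1);

        // print() triggers execution in recent Flink releases;
        // very old releases additionally required env.execute().
        counts.print();
    }

    // Type-safe user function: the (String, Integer) pair type is checked at compile time.
    public static final class Tokenizer
            implements FlatMapFunction<String, Tuple2<String, Integer>> {
        @Override
        public void flatMap(String line, Collector<Tuple2<String, Integer>> out) {
            for (String word : line.toLowerCase().split("\\W+")) {
                if (!word.isEmpty()) {
                    out.collect(new Tuple2<>(word, 1));
                }
            }
        }
    }
}
```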
- http://ci.apache.org/projects/flink/flink-docs-release-0.8.1/faq.html#is-flink-a-hadoop-project
- Flink is a data processing system and an alternative to Hadoop’s MapReduce component. It comes with its own runtime rather than building on top of MapReduce, so it can work completely independently of the Hadoop ecosystem. However, Flink can also access Hadoop’s distributed file system (HDFS) to read and write data, and Hadoop’s next-generation resource manager (YARN) to provision cluster resources. Since most Flink users store their data in Hadoop HDFS, Flink already ships the libraries required to access HDFS.
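To illustrate the HDFS integration described above, the sketch below reads from and writes back to HDFS paths using Flink's DataSet API; the namenode host, port, and paths are hypothetical placeholders.
```java
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class HdfsAccess {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Read lines from HDFS; host, port, and path are hypothetical placeholders.
        DataSet<String> input = env.readTextFile("hdfs://namenode:8020/data/input.txt");

        // Write the data back to HDFS unchanged.
        input.writeAsText("hdfs://namenode:8020/data/output");

        env.execute("HDFS round trip");
    }
}
```
When such a job is submitted to a YARN-managed cluster, Flink provisions its containers through YARN, as the FAQ quote above describes; the code itself does not change.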