Apache Flink Data Processing Framework
An Apache Flink Data Processing Framework is a Data Processing Framework managed by an Apache Flink Project.
- Context:
- ...
- Counter-Example(s):
- a Hadoop MapReduce Framework (to which Flink is an alternative).
- See: HPC.
2015
- https://flink.apache.org/
- Fast: State-of-the-art performance exploiting in-memory processing and data streaming.
- Reliable: Flink is designed to perform very well even when the cluster's memory runs out.
- Expressive: Write beautiful, type-safe code in Java and Scala, and execute it on a cluster (see the sketch after this list).
- Easy to use: Few configuration parameters required. Cost-based optimizer built in.
- Scalable: Tested on clusters of 100s of machines, Google Compute Engine, and Amazon EC2.
- Hadoop-compatible: Flink runs on YARN and HDFS and has a Hadoop compatibility package.
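To make the "type-safe code in Java" claim concrete, here is a minimal batch WordCount sketch against Flink's DataSet API (the API documented for the 0.8.x release cited below); the class name and input literals are illustrative, not taken from the source.
```java
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

// Minimal WordCount sketch against Flink's batch DataSet API.
public class WordCount {
    public static void main(String[] args) throws Exception {
        final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Illustrative in-memory input; a real job would read from a file or HDFS.
        DataSet<String> text = env.fromElements(
                "flink runs on yarn",
                "flink reads from hdfs");

        DataSet<Tuple2<String, Integer>> counts = text
                // split each line into (word, 1) pairs
                .flatMap(new Tokenizer())
                // group by the word (tuple field 0) and sum the counts (field 1)
                .groupBy(0)
                .sum(1);

        // print() triggers execution in recent Flink releases;
        // very old releases additionally required env.execute().
        counts.print();
    }

    // Type-safe user function: the (String, Integer) pair type is checked at compile time.
    public static final class Tokenizer
            implements FlatMapFunction<String, Tuple2<String, Integer>> {
        @Override
        public void flatMap(String line, Collector<Tuple2<String, Integer>> out) {
            for (String word : line.toLowerCase().split("\\W+")) {
                if (!word.isEmpty()) {
                    out.collect(new Tuple2<>(word, 1));
                }
            }
        }
    }
}
```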
- http://ci.apache.org/projects/flink/flink-docs-release-0.8.1/faq.html#is-flink-a-hadoop-project
- Flink is a data processing system and an alternative to Hadoop’s MapReduce component. It comes with its own runtime rather than building on top of MapReduce, so it can work completely independently of the Hadoop ecosystem. However, Flink can also access Hadoop’s distributed file system (HDFS) to read and write data, and Hadoop’s next-generation resource manager (YARN) to provision cluster resources. Since most Flink users store their data in Hadoop HDFS, Flink already ships the libraries required to access HDFS.
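To illustrate the HDFS integration described above, the sketch below reads from and writes back to HDFS paths using Flink's DataSet API; the namenode host, port, and paths are hypothetical placeholders.
```java
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class HdfsAccess {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Read lines from HDFS; host, port, and path are hypothetical placeholders.
        DataSet<String> input = env.readTextFile("hdfs://namenode:8020/data/input.txt");

        // Write the data back to HDFS unchanged.
        input.writeAsText("hdfs://namenode:8020/data/output");

        env.execute("HDFS round trip");
    }
}
```
When such a job is submitted to a YARN-managed cluster, Flink provisions its containers through YARN, as the FAQ quote above describes; the code itself does not change.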