AWS EMR-based Cluster

From GM-RKB

An AWS EMR-based Cluster is a computing cluster that is based on the Amazon EMR (Elastic MapReduce) service.



References

2015

  • http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-nodes.html
    • QUOTE: Amazon EMR defines three roles for the servers in a cluster. These different roles are referred to as node types. The Amazon EMR node types map to the master and slave roles defined in Hadoop.
      • Master node — Manages the cluster: coordinating the distribution of the MapReduce executable and subsets of the raw data, to the core and task instance groups. It also tracks the status of each task performed, and monitors the health of the instance groups. There is only one master node in a cluster. This maps to the Hadoop master node.
      • Core nodes — Run tasks and store data using the Hadoop Distributed File System (HDFS). This maps to a Hadoop slave node.
      • Task nodes (optional) — Run tasks. This maps to a Hadoop slave node.
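
The three node types quoted above can be sketched as a small Python data model. This is a minimal illustration, not the AWS implementation: the `NodeGroup` class, the `build_instance_groups` helper and the `m5.xlarge` instance type are hypothetical names chosen for the example, and the output dictionaries are only modeled on the `InstanceRole` values (`MASTER`, `CORE`, `TASK`) used by the EMR API.

```python
from dataclasses import dataclass

# Role names modeled on the EMR API's InstanceRole values.
MASTER, CORE, TASK = "MASTER", "CORE", "TASK"


@dataclass(frozen=True)
class NodeGroup:
    """Hypothetical description of one instance group in a cluster."""
    role: str
    instance_type: str
    count: int

    @property
    def runs_tasks(self) -> bool:
        # Per the quoted text, core and task nodes run tasks;
        # the master node coordinates and monitors them.
        return self.role in (CORE, TASK)

    @property
    def stores_hdfs_data(self) -> bool:
        # Per the quoted text, only core nodes store data in HDFS.
        return self.role == CORE


def build_instance_groups(core_count, task_count=0, instance_type="m5.xlarge"):
    """Return a list of instance-group dicts with exactly one master node.

    instance_type is a placeholder assumption, not a recommendation.
    """
    if core_count < 0 or task_count < 0:
        raise ValueError("node counts must be non-negative")

    # The quoted text states there is only one master node in a cluster.
    groups = [
        NodeGroup(MASTER, instance_type, 1),
        NodeGroup(CORE, instance_type, core_count),
    ]
    # Task nodes are optional, so the group is added only when requested.
    if task_count > 0:
        groups.append(NodeGroup(TASK, instance_type, task_count))

    return [
        {
            "Name": f"{g.role.lower()}-group",
            "InstanceRole": g.role,
            "InstanceType": g.instance_type,
            "InstanceCount": g.count,
        }
        for g in groups
    ]


if __name__ == "__main__":
    for group in build_instance_groups(core_count=2, task_count=3):
        print(group)
```

A list of this shape is close to what an EMR client would expect for its instance-group configuration, but the exact request format should be checked against the current AWS documentation before use.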