Elastic MapReduce (EMR)

  • Used to create Big Data clusters to analyze and process vast amounts of data
  • Built on Apache Hadoop, an open-source framework, to distribute your data and processing across a cluster that can span hundreds of EC2 instances
  • Supports open-source tools such as Apache Spark, HBase, Presto, Flink, etc.
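  • EMR handles the provisioning and configuration of the cluster for you
  • Supports auto-scaling, so cluster capacity can grow and shrink with the workload
  • Integrates with Spot Instances to lower compute cost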
  • Can be used to process large amounts of log files
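
As a rough illustration of the points above, here is a minimal Python sketch that builds the request for launching a cluster through boto3's EMR client method `run_job_flow`. The helper name `build_emr_cluster_request`, the instance type, the release label, the bucket path, and the group sizes are all assumptions chosen for the example, not recommended values; the two role names are the EMR default roles and must already exist in the account.

```python
from typing import Any, Dict


def build_emr_cluster_request(
    name: str,
    log_uri: str,
    core_count: int = 2,
    spot_task_count: int = 4,
    instance_type: str = "m5.xlarge",      # assumed instance type
    release_label: str = "emr-6.15.0",     # assumed EMR release
) -> Dict[str, Any]:
    """Build keyword arguments for boto3's EMR `run_job_flow` call.

    Hypothetical helper: it only assembles a dictionary and makes no AWS calls.
    """

    def group(label: str, role: str, market: str, count: int) -> Dict[str, Any]:
        # One instance group: a set of identical EC2 instances with one role.
        return {
            "Name": label,
            "InstanceRole": role,
            "Market": market,
            "InstanceType": instance_type,
            "InstanceCount": count,
        }

    return {
        "Name": name,
        "ReleaseLabel": release_label,
        "LogUri": log_uri,
        # Open-source tools to install on the cluster.
        "Applications": [{"Name": "Hadoop"}, {"Name": "Spark"}],
        "Instances": {
            "InstanceGroups": [
                group("Primary", "MASTER", "ON_DEMAND", 1),
                group("Core", "CORE", "ON_DEMAND", core_count),
                # Task nodes hold no HDFS data, so they are a common fit for Spot.
                group("Task (Spot)", "TASK", "SPOT", spot_task_count),
            ],
            # Shut the cluster down once it has no steps left to run.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


if __name__ == "__main__":
    request = build_emr_cluster_request(
        "log-analysis", "s3://example-bucket/emr-logs/"
    )
    for g in request["Instances"]["InstanceGroups"]:
        print(g["InstanceRole"], g["Market"], g["InstanceCount"])
    # To actually launch (requires the boto3 package and AWS credentials):
    #   import boto3
    #   boto3.client("emr").run_job_flow(**request)
```

Keeping the request construction separate from the API call makes it possible to inspect or review the cluster definition without touching AWS. Running the primary and core nodes On-Demand and only the task nodes on Spot is one common arrangement, shown here as an assumption rather than a requirement.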