Data Analytics Platform

From GM-RKB

A Data Analytics Platform is a data-processing platform that can be used to implement a data analytics system.



References

2016

  • http://lintool.github.io/bigdata-2016w/
    • QUOTE: This course provides an introduction to big data infrastructure for analytics. The focus is algorithm design and "thinking at scale": we will cover data mining and machine learning techniques as applied to text, graphs, and relational data. Most of the course will be taught in a combination of MapReduce and Spark, two representative dataflow abstractions for large-scale data analysis, although we will introduce alternative abstractions such as bulk-synchronous parallel and streaming models as well.

      One might break down the "big data" stack in the manner shown on the right. At the bottom resides the execution infrastructure, which is responsible for coordinating computations across a cluster (examples include MapReduce and Spark). In the middle resides analytics infrastructure, which implements data mining and machine learning algorithms on top of the execution infrastructure (an example would be MLlib in Spark). At the top are the tools data scientists use to generate insights, built on top of the analytics infrastructure. This course focuses on the middle part — by the end of the course, you will be able to implement basic data mining and machine learning algorithms that can operate at scale. Of course, effective algorithm design requires understanding the execution infrastructure (below) and what the algorithms are used for (above), so we will cover the broader context as well.
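
The layering described in the 2016 quote can be illustrated with a minimal sketch. It assumes a toy, single-process stand-in for the execution layer and is not how MapReduce or Spark are actually implemented. The names `map_reduce`, `word_count`, and `docs` are hypothetical and chosen only for illustration. The first function plays the role of execution infrastructure (grouping and combining values by key), the second is an analytics routine built on top of it, and the final lines stand in for the tool a data scientist would use.

```python
from collections import defaultdict
from functools import reduce


def map_reduce(records, mapper, reducer):
    """Toy execution layer (hypothetical): map, group by key, then reduce.

    A real system would distribute these phases across a cluster; this
    sketch runs everything in one process to show only the dataflow shape.
    """
    groups = defaultdict(list)
    for record in records:
        # Map phase: each record yields zero or more (key, value) pairs.
        for key, value in mapper(record):
            groups[key].append(value)
    # Reduce phase: fold each key's values into a single result.
    return {key: reduce(reducer, values) for key, values in groups.items()}


def word_count(lines):
    """Toy analytics layer (hypothetical): word frequencies via map_reduce."""
    return map_reduce(
        lines,
        mapper=lambda line: ((word.lower(), 1) for word in line.split()),
        reducer=lambda a, b: a + b,
    )


# Top layer: the caller only sees the analytics routine, not the execution details.
docs = ["big data at scale", "thinking at scale"]
counts = word_count(docs)
print(counts)
```

The point of the split is that `word_count` never touches the grouping logic, so under this assumption the execution layer could be swapped for a distributed one without changing the analytics code.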