2011 NIMBLEaToolkitfortheImplementat

From GM-RKB


(Ghoting et al., 2011) ⇒ Amol Ghoting, Prabhanjan Kambadur, Edwin Pednault, and Ramakrishnan Kannan. (2011). “NIMBLE: A Toolkit for the Implementation of Parallel Data Mining and Machine Learning Algorithms on Mapreduce.” In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2011). ISBN:978-1-4503-0813-7. doi:10.1145/2020408.2020464

Subject Headings:

Notes

Cited By

Quotes

Author Keywords

Algorithms; data mining; design; experimentation; machine learning; map/reduce; parallelism; performance; reliability; software libraries

Abstract

In the last decade, advances in data collection and storage technologies have led to an increased interest in designing and implementing large-scale parallel algorithms for machine learning and data mining (ML-DM). Existing programming paradigms for expressing large-scale parallelism such as MapReduce (MR) and the Message Passing Interface (MPI) have been the de facto choices for implementing these ML-DM algorithms. The MR programming paradigm has been of particular interest as it gracefully handles large datasets and has built-in resilience against failures. However, the existing parallel programming paradigms are too low-level and ill-suited for implementing ML-DM algorithms. To address this deficiency, we present NIMBLE, a portable infrastructure that has been specifically designed to enable the rapid implementation of parallel ML-DM algorithms. The infrastructure allows one to compose parallel ML-DM algorithms using reusable (serial and parallel) building blocks that can be efficiently executed using MR and other parallel programming models; it currently runs on top of Hadoop, which is an open-source MR implementation. We show how NIMBLE can be used to realize scalable implementations of ML-DM algorithms and present a performance evaluation.
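As a rough illustration of the idea of composing an ML-DM algorithm from reusable building blocks that run on a MapReduce-style model, the sketch below builds a frequent-item counter from a generic map/shuffle/reduce routine plus two small reusable functions. It is a single-process toy written for this page, not NIMBLE's or Hadoop's API; every name in it (`map_reduce`, `count_items`, `add`, `frequent_items`) is hypothetical.

```python
from collections import defaultdict
from functools import reduce

# All names below are hypothetical; this is not NIMBLE's or Hadoop's API.


def map_reduce(records, mapper, reducer):
    """Run one in-memory map/shuffle/reduce pass over `records`."""
    groups = defaultdict(list)
    for record in records:
        # Map phase: each record yields zero or more (key, value) pairs.
        for key, value in mapper(record):
            # Shuffle phase: group values by key.
            groups[key].append(value)
    # Reduce phase: fold the values of each key into a single result.
    return {key: reduce(reducer, values) for key, values in groups.items()}


def count_items(transaction):
    """Reusable mapper: emit (item, 1) for every item in a transaction."""
    return [(item, 1) for item in transaction]


def add(a, b):
    """Reusable reducer: sum two partial counts."""
    return a + b


def frequent_items(transactions, min_support):
    """Compose the blocks above, then filter serially by minimum support."""
    counts = map_reduce(transactions, count_items, add)
    return {item: n for item, n in counts.items() if n >= min_support}


transactions = [["milk", "bread"], ["milk", "eggs"], ["bread", "milk"]]
result = frequent_items(transactions, min_support=2)
```

Here the parallelizable step (`map_reduce`) and the serial step (the support filter) are separate pieces, so the same mapper and reducer could be reused in another algorithm; a real deployment would hand the map and reduce phases to a distributed runtime rather than a Python loop.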

References


	Author	volume	Date Value	title	type	journal	titleUrl	doi	note	year
2011 NIMBLEaToolkitfortheImplementat	Amol Ghoting Prabhanjan Kambadur Edwin Pednault Ramakrishnan Kannan			NIMBLE: A Toolkit for the Implementation of Parallel Data Mining and Machine Learning Algorithms on Mapreduce				10.1145/2020408.2020464		2011

