Google Labs Publications: MapReduce

R.A. Hettinga rah at shipwright.com
Wed Dec 1 19:43:15 PST 2004


<http://labs.google.com/papers/mapreduce.html>
Google Labs Publication

MapReduce: Simplified Data Processing on Large Clusters
Jeffrey Dean and Sanjay Ghemawat
Google Inc.

Abstract

 MapReduce is a programming model and an associated implementation for
processing and generating large data sets. Users specify a map function
that processes a key/value pair to generate a set of intermediate key/value
pairs, and a reduce function that merges all intermediate values associated
with the same intermediate key. Many real-world tasks are expressible in
this model, as shown in the paper.
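
 As an illustration of the model described above, here is a minimal
single-process sketch in Python of the classic word-count task. The names
map_fn, reduce_fn and run_mapreduce are hypothetical and chosen for this
example only; they are not the interface of Google's implementation, and
the sequential driver merely stands in for the real run-time system.

```python
from collections import defaultdict

def map_fn(key, value):
    """Map: (document name, document text) -> list of (word, 1) pairs."""
    return [(word.lower(), 1) for word in value.split()]

def reduce_fn(key, values):
    """Reduce: merge all counts emitted for one word into a single total."""
    return sum(values)

def run_mapreduce(inputs, map_fn, reduce_fn):
    """Sequential stand-in for the run-time: map, group by key, then reduce."""
    groups = defaultdict(list)
    for key, value in inputs:
        # Each input pair may emit any number of intermediate pairs.
        for ikey, ivalue in map_fn(key, value):
            groups[ikey].append(ivalue)
    # All values sharing an intermediate key go to one reduce call.
    return {ikey: reduce_fn(ikey, ivalues) for ikey, ivalues in groups.items()}

docs = [("a.txt", "the quick brown fox"), ("b.txt", "the lazy dog the end")]
counts = run_mapreduce(docs, map_fn, reduce_fn)
```

 The user writes only the two small functions; the grouping of
intermediate values by key is the driver's job.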

 Programs written in this functional style are automatically parallelized
and executed on a large cluster of commodity machines. The run-time system
takes care of the details of partitioning the input data, scheduling the
program's execution across a set of machines, handling machine failures,
and managing the required inter-machine communication. This allows
programmers without any experience with parallel and distributed systems to
easily utilize the resources of a large distributed system.
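
 The duties the run-time takes on can be sketched on a single machine.
The Python below is a toy under stated assumptions, not the paper's
implementation: a thread pool stands in for the cluster, a retry loop
stands in for rescheduling work after a machine failure, and a CRC32 hash
is an assumed choice for assigning intermediate keys to reduce tasks. All
function names and parameters are hypothetical.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def split_input(records, num_splits):
    """Partition the input records into splits, one per map task."""
    return [records[i::num_splits] for i in range(num_splits)]

def partition(key, num_reducers):
    """Assign an intermediate key to a reduce task using a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers

def run_with_retry(task, arg, max_attempts=3):
    """Re-execute a failed task; a stand-in for rescheduling it elsewhere."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task(arg)
        except Exception:
            if attempt == max_attempts:
                raise

def run_job(records, map_fn, reduce_fn, num_splits=4, num_reducers=2):
    def map_task(split):
        # Each map task sorts its output into one bucket per reduce task.
        buckets = [defaultdict(list) for _ in range(num_reducers)]
        for key, value in split:
            for ikey, ivalue in map_fn(key, value):
                buckets[partition(ikey, num_reducers)][ikey].append(ivalue)
        return buckets

    with ThreadPoolExecutor(max_workers=4) as pool:
        splits = split_input(records, num_splits)
        map_outputs = list(pool.map(lambda s: run_with_retry(map_task, s), splits))

        def reduce_task(r):
            # Gather bucket r from every map task: the "communication" step.
            merged = defaultdict(list)
            for buckets in map_outputs:
                for ikey, ivalues in buckets[r].items():
                    merged[ikey].extend(ivalues)
            return {k: reduce_fn(k, v) for k, v in merged.items()}

        parts = list(pool.map(lambda r: run_with_retry(reduce_task, r),
                              range(num_reducers)))

    result = {}
    for part in parts:
        result.update(part)
    return result

def word_map(name, text):
    return [(word, 1) for word in text.split()]

def word_reduce(word, ones):
    return sum(ones)

docs = [("a.txt", "to be or not to be"), ("b.txt", "to do")]
counts = run_job(docs, word_map, word_reduce)
```

 Because every occurrence of a key hashes to the same reduce task, each
reduce task can finish its share of the output without consulting the
others.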

 Our implementation of MapReduce runs on a large cluster of commodity
machines and is highly scalable: a typical MapReduce computation processes
many terabytes of data on thousands of machines. Programmers find the
system easy to use: hundreds of MapReduce programs have been implemented
and upwards of one thousand MapReduce jobs are executed on Google's
clusters every day.

 To appear in:
OSDI'04: Sixth Symposium on Operating Systems Design and Implementation,
 San Francisco, CA, December, 2004.

This material is presented to ensure timely dissemination of scholarly and
technical work. Copyright and all rights therein are retained by authors or
by other copyright holders. All persons copying this information are
expected to adhere to the terms and constraints invoked by each author's
copyright. In most cases, these works may not be reposted without the
explicit permission of the copyright holder.

 ©2004 Google
-- 
-----------------
R. A. Hettinga <mailto: rah at ibuc.com>
The Internet Bearer Underwriting Corporation <http://www.ibuc.com/>
44 Farquhar Street, Boston, MA 02131 USA
"... however it may deserve respect for its usefulness and antiquity,
[predicting the end of the world] has not been found agreeable to
experience." -- Edward Gibbon, 'Decline and Fall of the Roman Empire'




