Hadoop

From Wikipedia, the free encyclopedia

Lucene Hadoop
Developer: Apache Software Foundation
Latest release: 0.12.1 / March 17, 2007
OS: Cross-platform
Use: Search Engine
License: Apache License 2.0
Website: lucene.apache.org/hadoop/

Hadoop is a free Java software framework for distributed applications that process very large amounts of data on large clusters of commodity computers. It is an Apache Lucene sub-project and was originally developed to support distribution for Nutch.[1] Hadoop consists of a MapReduce implementation and a distributed filesystem, the "Hadoop Distributed File System" (HDFS), which is reminiscent of GoogleFS.[2]
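
The MapReduce paradigm mentioned above can be illustrated with a minimal single-process sketch. It does not use Hadoop's own API: the class name MapReduceSketch and its methods map, reduce, and run are hypothetical names chosen for this illustration. It only shows the general idea of splitting work into independent map calls, grouping the intermediate pairs by key, and reducing each group, here applied to counting words.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, single-process illustration of the map/reduce paradigm.
// It does not use any Hadoop classes and runs on one machine only.
public class MapReduceSketch {

    // Map phase: turn one input fragment (a line of text) into (word, 1) pairs.
    // Each call is independent, so a framework could run calls on any node.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Reduce phase: combine all values emitted for one key into one result.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Driver: map every line, group the pairs by key, then reduce each group.
    static Map<String, Integer> run(List<String> lines) {
        // Grouping step: collect every value emitted for the same key.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        // A TreeMap keeps the result sorted by word.
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = List.of("the quick brown fox", "the lazy dog", "the fox");
        // prints {brown=1, dog=1, fox=2, lazy=1, quick=1, the=3}
        System.out.println(run(input));
    }
}
```

In a real cluster the map and reduce calls would be spread across many machines and re-executed on failure, which this sketch does not attempt to model.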

Hadoop was named after a stuffed elephant belonging to its creator's child.

References

  1. "Hadoop is a Lucene sub-project that contains the distributed computing platform that was formerly a part of Nutch. This includes the Hadoop Distributed Filesystem (HDFS) and an implementation of map/reduce." About Hadoop
  2. "Hadoop is a framework for running applications on large clusters of commodity hardware. The Hadoop framework transparently provides applications both reliability and data motion. Hadoop implements a computational paradigm named map/reduce, where the application is divided into many small fragments of work, each of which may be executed or reexecuted on any node in the cluster. In addition, it provides a distributed file system that stores data on the compute nodes, providing very high aggregate bandwidth across the cluster. Both map/reduce and the distributed file system are designed so that node failures are automatically handled by the framework." About Hadoop
