Hadoop

From Wikipedia, the free encyclopedia

Lucene Hadoop

Developer:	Apache Software Foundation
Latest release:	0.12.1 / March 17, 2007
OS:	Cross-platform
Use:	Search Engine
License:	Apache License 2.0
Website:	lucene.apache.org/hadoop/

Hadoop is a free Java software framework that supports distributed applications which process very large amounts of data on large clusters of commodity computers. It is an Apache Lucene sub-project and was originally developed to support distribution for Nutch.^[1] Hadoop consists of a distributed file system reminiscent of the Google File System, named the Hadoop Distributed File System (HDFS), and a MapReduce implementation.^[2]
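The map/reduce paradigm mentioned above can be illustrated with a minimal in-memory sketch. It does not use Hadoop or any Hadoop API and runs in a single process; the function names (`map_fragment`, `shuffle`, `reduce_counts`, `word_count`) and the sample input are hypothetical and chosen only for illustration. It shows how a word count might be split into independent fragments of work whose intermediate results are grouped by key and then combined.

```python
from collections import defaultdict


def map_fragment(fragment):
    """Map step: emit a (word, 1) pair for every word in one input fragment."""
    return [(word.lower(), 1) for word in fragment.split()]


def shuffle(pairs):
    """Group intermediate values by key so each key is reduced in one place."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce step: combine all values emitted for a single key."""
    return key, sum(values)


def word_count(fragments):
    # Each fragment is mapped independently of the others. On a real cluster
    # these calls could run, or be re-run after a failure, on different nodes;
    # here they simply run one after another (an assumption of this sketch).
    intermediate = []
    for fragment in fragments:
        intermediate.extend(map_fragment(fragment))
    grouped = shuffle(intermediate)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


# Hypothetical sample input, already divided into fragments.
fragments = ["the quick brown fox", "the lazy dog", "the fox"]
counts = word_count(fragments)
print(counts)
```

Because each map call depends only on its own fragment, a fragment can be processed again without affecting the others, which is the property that lets a framework re-execute work after a node failure.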

Hadoop is named after a stuffed toy elephant belonging to its creator's child.

References

1. "Hadoop is a Lucene sub-project that contains the distributed computing platform that was formerly a part of Nutch. This includes the Hadoop Distributed Filesystem (HDFS) and an implementation of map/reduce." About Hadoop
2. "Hadoop is a framework for running applications on large clusters of commodity hardware. The Hadoop framework transparently provides applications both reliability and data motion. Hadoop implements a computational paradigm named map/reduce, where the application is divided into many small fragments of work, each of which may be executed or reexecuted on any node in the cluster. In addition, it provides a distributed file system that stores data on the compute nodes, providing very high aggregate bandwidth across the cluster. Both map/reduce and the distributed file system are designed so that node failures are automatically handled by the framework." About Hadoop

External links

- Hadoop website
- Hadoop wiki
- Hadoop Distributed File System requirements
- Mention of Nutch and Hadoop in an article about Google
- IBM MapReduce Tools for Eclipse
