Scikit-learn

From Wikipedia, the free encyclopedia

scikit-learn

Original author(s)	David Cournapeau
Initial release	June 2007^[1]
Stable release	0.13.1 / February 23, 2013
Preview release	0.14a1 / July 29, 2013
Written in	Python, Cython, C and C++
Operating system	Linux, Mac OS X, Microsoft Windows
Type	Library for machine learning
License	BSD License
Website	scikit-learn.org

scikit-learn (formerly scikits.learn) is an open source machine learning library for the Python programming language.^[2] It features various classification, regression and clustering algorithms including support vector machines, logistic regression, naive Bayes, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.
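A minimal sketch of how two of the algorithms named above, logistic regression and k-means, can be used on NumPy arrays. The toy data and variable names are invented for illustration, and the calls follow recent scikit-learn releases rather than the 0.13 series described in this article.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

# Toy data (made up): two well-separated groups of 2-D points.
X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
              [5.0, 5.1], [5.2, 4.9], [4.9, 5.3]])
y = np.array([0, 0, 0, 1, 1, 1])

# Supervised learning: fit a classifier on labelled data, then predict.
clf = LogisticRegression()
clf.fit(X, y)
pred = clf.predict(np.array([[0.1, 0.1], [5.1, 5.0]]))

# Unsupervised learning: cluster the same points without using the labels.
km = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = km.fit_predict(X)

print(pred, cluster_ids)
```

Both estimators accept plain NumPy arrays as input, which is the interoperability the paragraph above refers to.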

Overview

The scikit-learn project started as scikits.learn, a Google Summer of Code project by David Cournapeau. Its name stems from the notion that it is a "SciKit" (SciPy Toolkit), a separately developed and distributed third-party extension to SciPy. The original codebase was later extensively rewritten by other developers. Of the various scikits, scikit-learn and scikit-image were described as "well-maintained and popular" in November 2012.^[3]

As of 2013, scikit-learn is under active development and is sponsored by INRIA and occasionally Google (through the Google Summer of Code).^[4] Among its users are Evernote, which uses the library to distinguish recipes from other user posts through a naive Bayes classifier,^[5] and Mendeley, which builds recommender systems from scikit-learn's SGD regression algorithm.^[6] The Python Natural Language Toolkit (NLTK) includes a wrapper to allow use of scikit-learn through the nltk.classify API.^[7]
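To give a sense of what a naive Bayes text classifier of the kind mentioned above looks like in scikit-learn, here is a small sketch. It is not Evernote's actual pipeline: the example notes, the labels and the bag-of-words features are assumptions chosen only for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical training notes and labels (invented for this sketch).
notes = [
    "mix flour sugar and eggs then bake for 30 minutes",
    "chop onions and simmer with tomatoes and garlic",
    "meeting notes discuss quarterly budget and hiring",
    "flight itinerary and hotel booking confirmation",
]
labels = ["recipe", "recipe", "other", "other"]

# Turn each note into a vector of word counts.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(notes)

# Fit a multinomial naive Bayes model on the word counts.
clf = MultinomialNB()
clf.fit(X, labels)

# Classify an unseen note using the same vocabulary.
new_note = vectorizer.transform(["bake the eggs with sugar and flour"])
prediction = clf.predict(new_note)[0]
print(prediction)
```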

The scikit-learn API has been adopted by wise.io, who offer a proprietary implementation of random forests called wiseRF.^[8]^[9] wise.io's business partner Continuum IO claimed data throughput of up to 7.5 times that of scikit-learn's implementation;^[10] since then, the scikit-learn developers claim to have optimized their implementation to be competitive with wise.io's, except in terms of memory use.^[11]
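Adopting the scikit-learn API broadly means exposing the same estimator methods, chiefly fit and predict, so that code written against one implementation also works with another. The sketch below illustrates that convention with a deliberately trivial, hypothetical classifier written in plain Python; MajorityClassifier and train_and_predict are invented names and have nothing to do with wiseRF's actual code.

```python
from collections import Counter


class MajorityClassifier:
    """Hypothetical estimator: always predicts the most common training label."""

    def fit(self, X, y):
        # By scikit-learn convention, learned state is stored in attributes
        # ending with an underscore, and fit returns the estimator itself.
        self.majority_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        # One prediction per input sample.
        return [self.majority_ for _ in X]


def train_and_predict(estimator, X_train, y_train, X_test):
    """Works with any object that follows the fit/predict convention."""
    return estimator.fit(X_train, y_train).predict(X_test)


result = train_and_predict(
    MajorityClassifier(), [[0], [1], [2]], ["a", "b", "b"], [[5], [6]]
)
print(result)
```

Because train_and_predict relies only on the two method names, any estimator that follows the same convention could be passed in instead.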

Implementation

scikit-learn is largely written in Python, with some core algorithms written in Cython for performance. Support vector machines are implemented by a Cython wrapper around LIBSVM; logistic regression and linear support vector machines are implemented by a similar wrapper around LIBLINEAR.
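From the user's side the wrapped C/C++ libraries are not visible: both are reached through ordinary Python estimators. In the sketch below, sklearn.svm.SVC is the LIBSVM-backed estimator and sklearn.svm.LinearSVC is a LIBLINEAR-backed one. The toy data is invented, and the parameters shown follow recent scikit-learn releases, which may differ from the versions described in this article.

```python
import numpy as np
from sklearn.svm import SVC, LinearSVC

# Toy data (made up): two well-separated groups of 2-D points.
X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
              [5.0, 5.1], [5.2, 4.9], [4.9, 5.3]])
y = np.array([0, 0, 0, 1, 1, 1])
X_new = np.array([[0.1, 0.1], [5.0, 5.0]])

# Kernel SVM, backed by LIBSVM.
kernel_svm = SVC(kernel="rbf")
kernel_pred = kernel_svm.fit(X, y).predict(X_new)

# Linear SVM, backed by LIBLINEAR.
linear_svm = LinearSVC(random_state=0)
linear_pred = linear_svm.fit(X, y).predict(X_new)

print(kernel_pred, linear_pred)
```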

References

1. "Welcome to the SciPy Toolkits". 7 October 2009. Retrieved 7 June 2013.
2. Fabian Pedregosa; Gaël Varoquaux; Alexandre Gramfort; Vincent Michel; Bertrand Thirion; Olivier Grisel; Mathieu Blondel; Peter Prettenhofer; Ron Weiss; Vincent Dubourg; Jake Vanderplas; Alexandre Passos; David Cournapeau (2011). "Scikit-learn: Machine Learning in Python". Journal of Machine Learning Research 12: 2825–2830.
3. Eli Bressert (2012). SciPy and NumPy: an overview for developers. O'Reilly. p. 43.
4. "About Us". http://scikit-learn.org. Retrieved 3 May 2013.
5. Mark Ayzenshtat (22 January 2013). "Stay classified". Evernote Techblog. Retrieved 4 May 2013.
6. Mark Levy (2013). "Efficient Top-N Recommendation by Linear Regression". ACM RecSys Large Scale Recommender System workshop.
7. "scikitlearn Module". NLTK 2.0 Documentation. Retrieved 4 May 2013.
8. "wiserf". wise.io. Retrieved 22 January 2014.
9. Lars Buitinck; Gilles Louppe; Mathieu Blondel; Fabian Pedregosa; Andreas Mueller; Olivier Grisel; Vlad Niculae; et al. (2013). "API design for machine learning software: experiences from the scikit-learn project". ECML PKDD Workshop on Languages for Machine Learning.
10. Joseph W. Richards (27 November 2012). "wiseRF Use Cases and Benchmarks". Continuum IO. Retrieved 22 January 2014.
11. Gaël Varoquaux (8 August 2013). "Scikit-learn 0.14 release: features and benchmarks". Retrieved 22 January 2014.

This article is issued from Wikipedia. The text is available under the Creative Commons Attribution/Share Alike; additional terms may apply for the media files.