Cross Industry Standard Process for Data Mining

From Wikipedia, the free encyclopedia

CRISP-DM stands for CRoss Industry Standard Process for Data Mining^[1]. It is a data mining process model that describes the approaches expert data miners commonly use to tackle problems.

1 Major phases
2 History
3 CRISP-DM 2.0
4 Advantages
5 References
6 External links

Major phases

CRISP-DM breaks the process of data mining into six major phases^[2]:

Business Understanding
Data Understanding
Data Preparation
Modeling
Evaluation
Deployment
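
As an illustration only, the six phases listed above can be written down as an ordered enumeration. This is a minimal sketch, not part of the CRISP-DM specification: the names `Phase`, `next_phase` and `phase_names` are hypothetical, and treating the list as a strictly linear sequence is a simplifying assumption made for the example.

```python
from enum import IntEnum
from typing import List, Optional


class Phase(IntEnum):
    """The six CRISP-DM phases, numbered in the order they are listed."""
    BUSINESS_UNDERSTANDING = 1
    DATA_UNDERSTANDING = 2
    DATA_PREPARATION = 3
    MODELING = 4
    EVALUATION = 5
    DEPLOYMENT = 6


def next_phase(current: Phase) -> Optional[Phase]:
    """Return the phase listed after `current`, or None after the last one.

    Assumption: phases are walked in list order; this sketch does not
    model any movement back to an earlier phase.
    """
    if current is Phase.DEPLOYMENT:
        return None
    return Phase(current + 1)


def phase_names() -> List[str]:
    """Return the human-readable phase names in list order."""
    return [p.name.replace("_", " ").title() for p in Phase]


if __name__ == "__main__":
    for name in phase_names():
        print(name)
```

Calling `next_phase(Phase.MODELING)` returns `Phase.EVALUATION`, and `next_phase(Phase.DEPLOYMENT)` returns `None` because Deployment is the last phase in the list.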

History

CRISP-DM began as a European Union project under the ESPRIT funding initiative. The project was led by four companies: ISL, NCR, Daimler-Benz and OHRA.

This core consortium brought different experiences to the project. ISL was later acquired by and merged into SPSS Inc. The computer giant NCR produced the Teradata data warehouse and its own data mining software. Daimler-Benz (now DaimlerChrysler) had a significant data mining team. OHRA, an insurance company, was just starting to explore the potential use of data mining.

The first version of the methodology was released as CRISP-DM 1.0 in 1999.

CRISP-DM 2.0

In July 2006 the consortium announced that it would begin work on a second version of CRISP-DM. On 26 September 2006, the CRISP-DM SIG met to discuss potential enhancements for CRISP-DM 2.0 and the subsequent roadmap.