Cross Industry Standard Process for Data Mining
From Wikipedia, the free encyclopedia
CRISP-DM stands for CRoss Industry Standard Process for Data Mining[1]. It is a data mining process model that describes the approaches expert data miners commonly use to tackle problems.
Major phases
CRISP-DM breaks the process of data mining into six major phases[2]:
- Business Understanding
- Data Understanding
- Data Preparation
- Modeling
- Evaluation
- Deployment
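As a minimal sketch, the six phases listed above can be modeled as an ordered enumeration. The names `Phase`, `next_phase` and `phase_label` are hypothetical and are not part of CRISP-DM itself. The strictly linear ordering is a simplifying assumption made here for illustration.

```python
from enum import Enum
from typing import Optional


class Phase(Enum):
    """The six CRISP-DM phases, in the order they are listed above."""
    BUSINESS_UNDERSTANDING = 1
    DATA_UNDERSTANDING = 2
    DATA_PREPARATION = 3
    MODELING = 4
    EVALUATION = 5
    DEPLOYMENT = 6


def next_phase(current: Phase) -> Optional[Phase]:
    """Return the phase listed after `current`, or None after Deployment.

    Assumption: phases are walked strictly in list order.
    """
    phases = list(Phase)  # Enum members iterate in definition order
    index = phases.index(current)
    return phases[index + 1] if index + 1 < len(phases) else None


def phase_label(phase: Phase) -> str:
    """Turn a member name such as DATA_PREPARATION into 'Data Preparation'."""
    return phase.name.replace("_", " ").title()


if __name__ == "__main__":
    # Walk the phases from first to last and print each label.
    phase: Optional[Phase] = Phase.BUSINESS_UNDERSTANDING
    while phase is not None:
        print(phase_label(phase))
        phase = next_phase(phase)
```

Run as a script, this prints the six phase names one per line, in the same order as the list above.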
History
CRISP-DM began as a European Union project under the ESPRIT funding initiative. The project was led by four companies: ISL, NCR, Daimler-Benz and OHRA.
This core consortium brought different experiences to the project. ISL was later acquired by and merged into SPSS Inc. NCR, the computer giant, produced the Teradata data warehouse and its own data mining software. Daimler-Benz (now DaimlerChrysler) had a significant data mining team. OHRA, an insurance company, was just starting to explore the potential use of data mining.
The first version of the methodology was released as CRISP-DM 1.0 in 1999.
CRISP-DM 2.0
In July 2006 the consortium announced that it would begin work on a second version of CRISP-DM. On 26 September 2006, the CRISP-DM SIG met to discuss potential enhancements for CRISP-DM 2.0 and the subsequent roadmap.
Advantages
- Industry neutral
- Tool neutral
- Closely related to the KDD process model
- Anchors the data mining process
References
- Shearer, C. (2000). "The CRISP-DM model: the new blueprint for data mining". Journal of Data Warehousing 5: 13–22.
- Harper, Gavin; Stephen D. Pickett (August 2006). "Methods for mining HTS data". Drug Discovery Today 11 (15-16): 694–699.
External links
- CRoss Industry Standard Process for Data Mining
- CRoss Industry Standard Process for Data Mining Blog