Data Pre-processing

From Wikipedia, the free encyclopedia

Many factors affect the success of machine learning (ML) on a given task. The representation and quality of the instance data are first and foremost (Pyle, 1999). If the data contain much irrelevant or redundant information, or are noisy and unreliable, then knowledge discovery during the training phase is more difficult. Data preparation and filtering steps are known to take a considerable amount of processing time in ML problems. Data pre-processing includes data cleaning, normalization, transformation, feature extraction and feature selection. The product of data pre-processing is the final training set. Kotsiantis et al. (2006) present a well-known algorithm for each step of data pre-processing.
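
As a rough illustration of how a few of these steps can be chained to produce a training set, the following minimal Python sketch applies data cleaning, normalization and feature selection in sequence. The function names (clean, normalize, select, preprocess), the sample data and the specific techniques chosen (dropping incomplete instances, min-max scaling, removing constant features) are assumptions made for this example only; they are not the algorithms presented by Kotsiantis et al. (2006).

```python
from typing import List, Optional

# Hypothetical representation: one instance per row, one feature per column,
# with None marking a missing value.
Row = List[Optional[float]]


def clean(rows: List[Row]) -> List[List[float]]:
    """Data cleaning: drop every instance that contains a missing value."""
    return [list(r) for r in rows if all(v is not None for v in r)]


def normalize(rows: List[List[float]]) -> List[List[float]]:
    """Normalization: min-max rescale each feature column to [0, 1].

    A constant column has no range to scale, so it is mapped to 0.0.
    """
    if not rows:
        return []
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(r, lows, highs)]
        for r in rows
    ]


def select(rows: List[List[float]]) -> List[List[float]]:
    """Feature selection: drop features that take a single value everywhere,
    since they cannot help distinguish one instance from another."""
    if not rows:
        return []
    cols = list(zip(*rows))
    keep = [i for i, c in enumerate(cols) if max(c) > min(c)]
    return [[r[i] for i in keep] for r in rows]


def preprocess(rows: List[Row]) -> List[List[float]]:
    """Chain the steps; the result plays the role of the final training set."""
    return select(normalize(clean(rows)))


raw = [
    [1.0, 5.0, 10.0],
    [2.0, 5.0, None],   # incomplete instance, removed by clean()
    [3.0, 5.0, 30.0],
    [5.0, 5.0, 20.0],
]
training_set = preprocess(raw)
print(training_set)
```

In this sketch the second feature is constant and is therefore discarded, while the instance with a missing value is removed before scaling so that it does not influence the column ranges. In practice, the order and choice of steps depend on the data and the learning algorithm.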

References

Kotsiantis, S., Kanellopoulos, D., Pintelas, P., 2006. Data Preprocessing for Supervised Learning. International Journal of Computer Science, Vol. 1, No. 2, pp. 111-117.
Pyle, D., 1999. Data Preparation for Data Mining. Morgan Kaufmann Publishers, Los Altos, CA.

Categories: Machine learning
