Pruning (algorithm)
From Wikipedia, the free encyclopedia
This article may require cleanup to meet Wikipedia's quality standards. Please improve this article if you can. (May 2008) |
Pruning is a term in mathematics and informatics which describes a method of enumeration, which allows to cut parts of a decision tree. Pruned parts of the tree are no longer considered because the algorithm knows based on already collected data (e.g. through sampling) that these subtrees do not contain the searched object. The decision tree is simplified by removing some decisions.
Contents |
[edit] Introduction
Decision trees are built with the available information and they are supporting tools in a decision making process. However, sometimes the given set of data may be irrelevant, erroneous or unnecessary, in which case the decision tree formed may not be correct; this phenomenon is called overfitting. Also, too much information may misguide the decision policy. When this happens, it becomes necessary to remove the nodes from the decision tree. The process of making the decision tree smaller by removing nodes that are not helpful is called Pruning.
[edit] Illustration
Pruning is carried out as a step by step process. In general, it is performed as
1. Remove the interior node in the tree, a non-leaf node. 2. Evaluate the performance without the pruned part. 3. Repeat until no improvement is obtained from pruning 4. Retain the best performed tree as the decision tree.
Consider the decision tree as given below,
When a decision process is related to the selection of a flower, then the branch with the shape is irrelevant and not related to the flower branch. Hence that particular branch can be pruned to improve performance.
The above illustration is a simple example of pruning.
[edit] Further Study
Pessimistic Decision tree pruning based on Tree size [1] MDL based decision tree pruning Decision tree pruning using backpropagation neural networks
[edit] See also
[edit] References
[edit] External links
- [ http://www.cis.upenn.edu/~mkearns/papers/pruning.pdf] (Optimization Algorithm)
- [ http://www.math.tau.ac.il/~mansour/ml-course/scribe11.ps] (Introduction to Decision tree pruning)