Tree pruning is applied when a decision tree is built, because many of the tree's branches reflect anomalies in the training data caused by noise or outliers. Tree pruning methods address this problem of overfitting the data. They typically use statistical measures to remove the least reliable branches, which generally results in faster classification and an improvement in the ability of the tree to correctly classify independent test data.
The main question, then, is: how does tree pruning work?
There are two common approaches to tree pruning: prepruning and postpruning.
In the prepruning approach, a tree is "pruned" by halting its construction early (for example, by deciding not to further split or partition the subset of training samples at a given node). Upon halting, the node becomes a leaf. The leaf may hold the most frequent class among the subset samples or the probability distribution of those samples. If partitioning the samples at a node would result in a split whose goodness measure (such as information gain) falls below a pre-specified threshold, then further partitioning of the given subset is halted. Choosing an appropriate threshold is difficult, however. High thresholds could result in over-simplified trees, while low thresholds could result in very little simplification.
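The halting rule above can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes information gain as the goodness measure, and the names `entropy`, `information_gain`, `make_node`, and `min_gain` are hypothetical helpers chosen for this example, not part of any library.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(parent, left, right):
    """Entropy reduction obtained by splitting `parent` into `left` and `right`."""
    total = len(parent)
    weighted = (len(left) / total) * entropy(left) + (len(right) / total) * entropy(right)
    return entropy(parent) - weighted


def make_node(parent, left, right, min_gain):
    """Prepruning decision for one candidate split (hypothetical helper).

    If the gain falls below the assumed threshold `min_gain`, construction
    halts and the node becomes a leaf holding the most frequent class and
    the class distribution of its samples.
    """
    if information_gain(parent, left, right) < min_gain:
        counts = Counter(parent)
        total = len(parent)
        return {
            "leaf": True,
            "label": counts.most_common(1)[0][0],
            "distribution": {cls: n / total for cls, n in counts.items()},
        }
    return {"leaf": False}  # the split is good enough; keep growing the tree
```

With a high `min_gain`, most candidate splits are rejected and the tree stays small; with a low value, almost every split is accepted and little simplification takes place.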
In the postpruning approach, branches are removed from a fully grown tree. A tree node is pruned by removing its branches. The cost complexity pruning algorithm is an example of the postpruning approach. The pruned node becomes a leaf and is labeled with the most frequent class among its former branches. For each non-leaf node in the tree, the algorithm calculates the expected error rate that would occur if the subtree at that node were pruned.
The tree that minimizes the expected error rate is preferred. Alternatively, rather than relying on expected error rates, trees can be pruned based on the number of bits required to encode them.
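The error-based pruning decision can be illustrated with a small bottom-up sketch. It is not the cost complexity algorithm itself. As a simplifying assumption, each node here stores the class counts of held-out samples that reach it, and a subtree is replaced by a leaf whenever the leaf would make no more errors than the subtree. The dictionary layout and the names `errors_as_leaf` and `prune` are hypothetical.

```python
def errors_as_leaf(node):
    """Held-out samples misclassified if `node` predicts its majority label."""
    return sum(n for cls, n in node["counts"].items() if cls != node["label"])


def prune(node):
    """Prune the tree in place, bottom-up; return the held-out errors that remain.

    Error counts are compared instead of error rates. Both options are
    evaluated on the same samples, so the comparison gives the same result.
    """
    children = node.get("children", [])
    if not children:
        return errors_as_leaf(node)
    subtree_errors = sum(prune(child) for child in children)
    leaf_errors = errors_as_leaf(node)
    if leaf_errors <= subtree_errors:
        node["children"] = []  # the node becomes a leaf
        return leaf_errors
    return subtree_errors


# Toy tree: "label" is the majority class at the node (assumed to come from
# training), "counts" are the classes of the held-out samples reaching it.
tree = {
    "label": "a", "counts": {"a": 6, "b": 4},
    "children": [
        {"label": "a", "counts": {"a": 5, "b": 1},
         "children": [
             {"label": "a", "counts": {"a": 3, "b": 0}},
             {"label": "b", "counts": {"a": 2, "b": 1}},
         ]},
        {"label": "b", "counts": {"a": 1, "b": 3}},
    ],
}
remaining_errors = prune(tree)
```

In this toy tree the first child is collapsed into a leaf (1 error as a leaf versus 2 for its subtree), while the root is kept (4 errors as a leaf versus 2 for the pruned subtree).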
Prepruning and postpruning may be interleaved for a combined approach. Postpruning requires more computation than prepruning.