Randomized ensemble methods for classification trees

Authors
Kobayashi, Izumi
Advisors
Buttrey, Samuel E.
Subjects
Classification
Ensemble methods
Date of Issue
2002-09
Publisher
Monterey, California. Naval Postgraduate School
Abstract
We propose two methods of constructing ensembles of classifiers. The first injects randomness directly into classification tree algorithms by choosing a split at random at each node, with probabilities proportional to the measure of goodness for each split; we combine this with a stopping rule that uses permutations of the output. The second perturbs the output and constructs a classifier from the perturbed data. In both methods, the final classifier is given by an unweighted vote of the individual classifiers. These methods are compared with bagging, AdaBoost, and random forests on thirteen commonly used data sets. The results show that our methods perform better than bagging and, on average, comparably to AdaBoost and random forests. Additional computation shows that our perturbation method could improve its performance by perturbing both the inputs and the outputs and by combining a sufficiently large number of trees. Plots of strength and correlation show an interesting relationship. We also explore combining our proposed methods with sampling subsets of the training set; the results of a few trials show that this combination could further improve their performance.
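To make the first method concrete, the following is a minimal Python sketch — not the thesis's implementation — of an ensemble in which each tree picks its split at random with probability proportional to the split's goodness (here, Gini impurity decrease), and the trees are combined by unweighted vote. All function names and the toy data are invented for illustration; the permutation-based stopping rule and the output-perturbation method are omitted.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_gain(X, y, feat, thresh):
    """Goodness of the split X[feat] <= thresh, as impurity decrease."""
    left = [yi for xi, yi in zip(X, y) if xi[feat] <= thresh]
    right = [yi for xi, yi in zip(X, y) if xi[feat] > thresh]
    n = len(y)
    if not left or not right:
        return 0.0
    return gini(y) - (len(left) / n) * gini(left) - (len(right) / n) * gini(right)

def build_tree(X, y, depth, rng):
    """Grow a tree, choosing each split at random with probability
    proportional to its goodness, rather than taking the best split."""
    if depth == 0 or gini(y) == 0.0:
        return Counter(y).most_common(1)[0][0]  # leaf: majority label
    # Candidate splits: midpoints between consecutive feature values.
    candidates = []
    for f in range(len(X[0])):
        vals = sorted(set(x[f] for x in X))
        for a, b in zip(vals, vals[1:]):
            t = (a + b) / 2
            g = split_gain(X, y, f, t)
            if g > 0:
                candidates.append((f, t, g))
    if not candidates:
        return Counter(y).most_common(1)[0][0]
    # Randomized selection weighted by goodness -- the key idea.
    f, t, _ = rng.choices(candidates, weights=[g for _, _, g in candidates], k=1)[0]
    L = [(x, yi) for x, yi in zip(X, y) if x[f] <= t]
    R = [(x, yi) for x, yi in zip(X, y) if x[f] > t]
    return (f, t,
            build_tree([x for x, _ in L], [yi for _, yi in L], depth - 1, rng),
            build_tree([x for x, _ in R], [yi for _, yi in R], depth - 1, rng))

def predict_tree(node, x):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if x[f] <= t else right
    return node

def ensemble_predict(trees, x):
    """Unweighted majority vote over the individual trees."""
    return Counter(predict_tree(t, x) for t in trees).most_common(1)[0][0]

# Toy 2-D data: class 0 in the lower-left cluster, class 1 in the upper-right.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (3, 3), (3, 4), (4, 3), (4, 4)]
y = [0, 0, 0, 0, 1, 1, 1, 1]
rng = random.Random(0)
trees = [build_tree(X, y, depth=2, rng=rng) for _ in range(51)]
print(ensemble_predict(trees, (0.5, 0.5)))  # point near the class-0 cluster
print(ensemble_predict(trees, (3.5, 3.5)))  # point near the class-1 cluster
```

Because the randomized choice still favors high-goodness splits, individual trees remain reasonably accurate while differing from one another, which is what makes the unweighted vote effective.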
Type
Thesis
Department
Operations Research
Format
xviii, 121 p. : ill. ; 28 cm.
Distribution Statement
Approved for public release; distribution is unlimited.
Rights
This publication is a work of the U.S. Government as defined in Title 17, United States Code, Section 101. Copyright protection is not available for this work in the United States.