BIB-VERSION:: CS-TR-v2.0
ID:: STAN//CS-TR-97-1587
ENTRY:: March 24, 1997
ORGANIZATION:: Stanford University, Department of Computer Science
TITLE:: Ensembles for Supervised Classification Learning
TYPE:: Thesis
TYPE:: Technical Report
AUTHOR:: Matan, Ofer
DATE:: March 1997
PAGES:: 100
ABSTRACT:: This dissertation studies the use of multiple classifiers (ensembles or committees) in learning tasks. Both theoretical and practical aspects of combining classifiers are studied. First we analyze the representational ability of voting ensembles. A voting ensemble may perform either better or worse than each of its individual members; we give tight upper and lower bounds on the classification performance of a voting ensemble as a function of the classification performances of its individual members. Boosting is a method of combining multiple "weak" classifiers to form a "strong" classifier. Several issues concerning boosting are studied in this thesis. We study SBA, a hierarchical boosting algorithm proposed by Schapire, in terms of its representation and its search. We present a rejection boosting algorithm that trades off exploration and exploitation: it requires fewer pattern labels at the expense of lower boosting ability. Ensembles may also be used to gain information. We study their use to minimize the labeling cost of data and to enable performance improvements over time. For this purpose, a model of on-site learning is presented: the system learns by querying "hard" patterns while classifying "easy" ones.
NOTES:: [Adminitrivia V1/Prg/19970324]
END:: STAN//CS-TR-97-1587