PCTBagging: From inner ensembles to ensembles. A trade-off between discriminating capacity and interpretability
Ibarguren I., Pérez J.M., Muguerza J., Arbelaitz O., Yera A.
Combining decision trees into ensemble classifiers considerably improves discriminating capacity. However, the resulting classifiers are no longer interpretable, even though comprehensibility is one of the desirable traits of decision trees. Consolidation (the consolidated tree construction algorithm, CTC) was introduced to improve the discriminating capacity of decision trees: it uses a set of samples to build a single consolidated tree, without sacrificing transparency. In this work, PCTBagging is presented as a hybrid between bagging and a consolidated tree that retains part of the comprehensibility of the consolidated tree while also improving the discriminating capacity. The consolidated tree is first developed up to a certain point, and then typical bagging is performed for each sample. The part of the consolidated tree to be developed initially is configured by setting a consolidation percentage. Eleven different consolidation percentages are considered for PCTBagging in order to analyse the trade-off between comprehensibility and discriminating capacity. The results of PCTBagging are compared with those of bagging, CTC and C4.5, the base algorithm for all the others. With a low consolidation percentage, PCTBagging achieves a discriminating capacity similar to that of bagging while maintaining part of the interpretable structure of the consolidated tree. With a consolidation percentage of 100%, PCTBagging offers the same comprehensibility as CTC but achieves a significantly greater discriminating capacity.
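The hybrid scheme described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: Gini-based binary splits on numeric features stand in for C4.5, the samples are bootstrap samples, the consolidated split at each node is chosen by a majority vote on the feature with the median of the proposed thresholds, and the hypothetical `depth` parameter stands in for the consolidation percentage. All function names are hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows):
    """Best (feature, threshold) by weighted Gini for one sample, or None."""
    best, best_score = None, gini([y for _, y in rows])
    for f in range(len(rows[0][0])):
        values = sorted({x[f] for x, _ in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for x, y in rows if x[f] <= t]
            right = [y for x, y in rows if x[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best

def grow_tree(rows):
    """Ordinary tree for a single sample; leaves are Counters of class labels."""
    split = best_split(rows) if rows else None
    if split is None:
        return Counter(y for _, y in rows)
    f, t = split
    return (f, t, grow_tree([r for r in rows if r[0][f] <= t]),
            grow_tree([r for r in rows if r[0][f] > t]))

def grow_hybrid(samples, depth):
    """Shared (consolidated) splits down to `depth`, then one tree per sample.

    `depth` is an assumed proxy for the consolidation percentage.
    """
    proposals = [p for p in (best_split(rows) for rows in samples if rows) if p]
    if depth == 0 or not proposals:
        return [grow_tree(rows) for rows in samples]  # bagging part
    # Assumed consolidation rule: vote on the feature, take the median threshold.
    f = Counter(p[0] for p in proposals).most_common(1)[0][0]
    thresholds = sorted(p[1] for p in proposals if p[0] == f)
    t = thresholds[len(thresholds) // 2]
    left = [[r for r in rows if r[0][f] <= t] for rows in samples]
    right = [[r for r in rows if r[0][f] > t] for rows in samples]
    return (f, t, grow_hybrid(left, depth - 1), grow_hybrid(right, depth - 1))

def predict(model, x):
    node = model
    while isinstance(node, tuple):  # walk the shared, interpretable part
        f, t, left, right = node
        node = left if x[f] <= t else right
    votes = Counter()
    for tree in node:  # one sub-tree per sample below the shared part
        while isinstance(tree, tuple):
            f, t, left, right = tree
            tree = left if x[f] <= t else right
        if tree:  # skip leaves built from an empty subset
            votes[tree.most_common(1)[0][0]] += 1
    return votes.most_common(1)[0][0] if votes else None

# Toy data (assumed): two numeric features, class 1 when their sum exceeds 1.
rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(200)]
data = [((a, b), int(a + b > 1.0)) for a, b in points]
samples = [[rng.choice(data) for _ in data] for _ in range(5)]

model = grow_hybrid(samples, depth=2)
accuracy = sum(predict(model, x) == y for x, y in data) / len(data)
```

In this sketch, `depth=0` reduces to plain bagging, since the model is just a list of per-sample trees. A `depth` large enough that no sample proposes a further split yields one fully shared tree structure, which loosely mirrors the two extremes of the consolidation percentage.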