Optimization of classification trees: strategy and algorithm improvement

2019-12-06T07:43:25Z (GMT) by Martin Kröger
Abstract
We present a new version of the program MedTree, which calculates classification trees of higher quality than trees built by direct evaluation of a splitting criterion. MedTree calculates a large number of possible tree segments and recursively selects the best of these parts to form an ‘optimal’ tree; this requires a discussion of how ‘optimal’ is defined. The characteristics of the improved version are as follows: analysis of incomplete data, higher performance,...

Title of program: MedTree 4.1
Catalogue Id: ADCY_v2_0 [ADEM]

Nature of problem
The problem is to find the best trees for classifying a given subject into one of two groups [1,2]. Initially, the user must sample a set of features for a sufficiently large number of representative subjects from both groups. A good tree is expected to be found if there are patterns of behaviour, or even complex correlations, within the input information. The algorithm allows boundary conditions to be taken into account, so that the classification tree fits its practical purpose.

Versions of this program held in the CPC repository in Mendeley Data
ADCY_v1_0; MedTree 3.1; 10.1016/0010-4655(96)00002-1
ADCY_v2_0; MedTree 4.1; 10.1016/S0010-4655(96)00123-3

This program has been imported from the CPC Program Library held at Queen's University Belfast (1969-2019)
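The idea stated in the abstract, evaluating many candidate tree segments and recursively keeping the best ones instead of committing to a single split chosen by a splitting criterion, can be illustrated with a minimal Python sketch. This is not MedTree's code or its actual algorithm. It assumes binary features, two groups labelled 0 and 1, the number of misclassified samples as the measure of quality, and a fixed depth limit; the names `leaf`, `best_tree` and `classify` are hypothetical.

```python
from typing import List, Tuple, Union

# Assumption: each subject is (binary feature tuple, group label 0 or 1).
Sample = Tuple[Tuple[int, ...], int]
# A tree is either a leaf label (int) or (feature index, subtree for 0, subtree for 1).
Tree = Union[int, tuple]


def leaf(samples: List[Sample]) -> Tuple[int, int]:
    """Return the majority label and the number of samples it misclassifies."""
    ones = sum(label for _, label in samples)
    zeros = len(samples) - ones
    return (1, zeros) if ones > zeros else (0, ones)


def best_tree(samples: List[Sample], depth: int) -> Tuple[Tree, int]:
    """Try every feature at this node and keep the subtree pair with fewest errors.

    Unlike a greedy builder, a split is judged by the best trees that can be
    grown below it, not by an immediate splitting criterion.
    """
    label, errors = leaf(samples)
    best: Tuple[Tree, int] = (label, errors)
    if depth == 0 or errors == 0:
        return best
    for f in range(len(samples[0][0])):
        left = [s for s in samples if s[0][f] == 0]
        right = [s for s in samples if s[0][f] == 1]
        if not left or not right:
            continue  # this feature does not separate the samples
        left_tree, left_err = best_tree(left, depth - 1)
        right_tree, right_err = best_tree(right, depth - 1)
        if left_err + right_err < best[1]:
            best = ((f, left_tree, right_tree), left_err + right_err)
    return best


def classify(tree: Tree, features: Tuple[int, ...]) -> int:
    """Walk the tree from the root down to a leaf label."""
    while not isinstance(tree, int):
        f, left_tree, right_tree = tree
        tree = right_tree if features[f] else left_tree
    return tree


# XOR-like data: no single split reduces the error, but a two-level tree is exact.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
tree, errors = best_tree(data, depth=2)
```

On this XOR-like data no single split lowers the error count, so a builder that stops when the immediate split brings no gain would keep a single leaf, whereas the recursive search with `depth=2` finds a tree with no errors. The cost grows quickly with depth and number of features, which is why this sketch is only suitable for small illustrative inputs.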