A decision-tree learning algorithm builds a hypothesis, or decision rule, in DNF (Disjunctive Normal Form), represented as a tree graph. Each internal node of the tree tests a feature value (e.g., is feature f = value v?). Each path from the root of the tree to a leaf is a rule formed by the conjunction of all feature tests found along the path. The leaf node specifies the class assigned to any example that satisfies the rule.
Training examples are represented as pairs (x, c), where x is a feature vector and c is the class assigned to x. Features can be nominal or numeric (numeric features are discretized prior to the learning phase). Decision-tree algorithms are popular for their relative speed and for the interpretability of the resulting hypothesis, which can give insight into how the features relate to the target concept.
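As a concrete illustration, here is a minimal Python sketch of the structures described above: internal nodes that test a feature value, leaves that carry a class label, and training examples as (x, c) pairs. The names (Node, examples, and the feature names) are hypothetical and not part of any component API.

```python
class Node:
    """An internal node tests one feature; a leaf carries a class label."""
    def __init__(self, feature=None, children=None, label=None):
        self.feature = feature          # feature tested here (None at a leaf)
        self.children = children or {}  # feature value -> child subtree
        self.label = label              # class assigned at a leaf (None otherwise)

    def is_leaf(self):
        return self.label is not None

# Training examples as (x, c) pairs; the features here are nominal.
examples = [
    ({"outlook": "sunny", "windy": "false"}, "play"),
    ({"outlook": "sunny", "windy": "true"},  "dont_play"),
    ({"outlook": "rain",  "windy": "false"}, "play"),
    ({"outlook": "rain",  "windy": "true"},  "dont_play"),
]
```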
A decision tree is built by recursively partitioning the training set until the examples in each partition belong to the same class. At each node the algorithm selects the feature that best divides the training set into class-uniform regions, and is then applied recursively to each region (for details see "C4.5: Programs for Machine Learning" by J. R. Quinlan, Morgan Kaufmann, 1993).
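The recursive partitioning step can be sketched as follows. This is a simplified ID3-style version of the idea, reusing the hypothetical Node class from the sketch above: at each node it selects the feature whose split minimizes the weighted entropy of the resulting regions (equivalently, maximizes information gain). C4.5 itself adds refinements such as gain ratio, numeric-feature handling, and pruning, which are omitted here.

```python
import math
from collections import Counter

def entropy(examples):
    """Class entropy of a list of (x, c) pairs."""
    total = len(examples)
    counts = Counter(c for _, c in examples)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def partition(examples, feature):
    """Group examples by the value they take on the given feature."""
    groups = {}
    for x, c in examples:
        groups.setdefault(x[feature], []).append((x, c))
    return groups

def build_tree(examples, features):
    """Recursively partition the examples until each region is class-uniform."""
    counts = Counter(c for _, c in examples)
    if len(counts) == 1 or not features:
        # Leaf: the region is class-uniform (or no features remain to test).
        return Node(label=counts.most_common(1)[0][0])

    def weighted_entropy(f):
        groups = partition(examples, f).values()
        return sum(len(g) / len(examples) * entropy(g) for g in groups)

    best = min(features, key=weighted_entropy)  # highest information gain
    node = Node(feature=best)
    rest = [f for f in features if f != best]
    for value, subset in partition(examples, best).items():
        node.children[value] = build_tree(subset, rest)
    return node

tree = build_tree(examples, ["outlook", "windy"])
```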
A decision-tree algorithm operates in two phases. In the training (or learning) phase, the algorithm recursively builds the decision tree, searching for regions of examples that are class-uniform. The learning phase works only in batch mode (although incremental versions of decision trees do exist).
In the testing phase, each example is run through the decision tree by following the path whose test at each node is true of the example. The example is then assigned the class of the leaf node reached. Statistics such as percentCorrect are maintained.
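A sketch of this traversal and of a percent-correct statistic follows; percent_correct here merely stands in for the percentCorrect statistic mentioned above, not the component's actual implementation.

```python
def classify(tree, x):
    """Follow the path whose test is true of x at each node, down to a leaf."""
    node = tree
    while not node.is_leaf():
        # Assumes x[node.feature] was seen during training; a full
        # implementation would handle unseen feature values gracefully.
        node = node.children[x[node.feature]]
    return node.label

def percent_correct(tree, test_set):
    """Percentage of (x, c) test pairs for which the tree predicts c."""
    hits = sum(1 for x, c in test_set if classify(tree, x) == c)
    return 100.0 * hits / len(test_set)
```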
When the tree is in Run mode, a pattern is run through the decision tree in the same way: the path that makes the test at each node true is followed, and the pattern is assigned the class of the leaf node reached (the classify sketch above applies unchanged).
The following inspectable properties are available: