Decision Tree Classifier Agent Properties and Use

Properties

The Decision Tree Classifier Agent panel provides these options:

Training File Name
Enter the name of a definition file, or use the Browse button to select one. The definition file specifies decision tree parameters such as the number of attributes (fields) in each example, the number of possible classes, and the number of records in the input file. One of the fields must be named class to support supervised learning. The data itself must be in a file with the same name and a .dat extension. This file is used when the mode is Train and when you press Generate Beans.
Testing File Name
Enter the name of a definition file with the same layout as the training definition file, or use the Browse button to select one. This file is used when the mode is set to Test or Run.
Metric parameter
Specifies the metric used to select the best attribute at each node. It can be one of the following: gainRatio (0), chiSquare (1), laplace (2), gini (3), relief (4), contextInfo (5), intraInterDist (6). The metric can be changed between runs of the algorithm.
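As an illustration of what an attribute-selection metric measures, the following minimal Java sketch maps the metric codes listed above to an enum and computes the textbook Gini impurity of a candidate split. The names SplitMetric, GiniSketch, impurity, and splitImpurity are hypothetical and are not part of the agent; the agent's own gini computation may differ in detail.

```java
import java.util.Arrays;

/** Hypothetical enum: only the labels and numeric codes come from the list above. */
enum SplitMetric {
    GAIN_RATIO(0), CHI_SQUARE(1), LAPLACE(2), GINI(3),
    RELIEF(4), CONTEXT_INFO(5), INTRA_INTER_DIST(6);

    final int code;

    SplitMetric(int code) { this.code = code; }

    /** Looks up a metric by its numeric code. */
    static SplitMetric fromCode(int code) {
        for (SplitMetric m : values()) {
            if (m.code == code) return m;
        }
        throw new IllegalArgumentException("Unknown metric code: " + code);
    }
}

/** Textbook Gini impurity; an assumption about what the gini metric measures. */
final class GiniSketch {

    /** Gini impurity of a class-count vector: 1 - sum of squared class proportions. */
    static double impurity(int[] classCounts) {
        int total = Arrays.stream(classCounts).sum();
        if (total == 0) return 0.0;
        double sumSquares = 0.0;
        for (int count : classCounts) {
            double p = (double) count / total;
            sumSquares += p * p;
        }
        return 1.0 - sumSquares;
    }

    /**
     * Weighted impurity after splitting on one attribute.
     * counts[v][k] is the number of records with attribute value v and class k.
     * Lower values indicate a better attribute under this metric.
     */
    static double splitImpurity(int[][] counts) {
        int total = 0;
        for (int[] row : counts) total += Arrays.stream(row).sum();
        if (total == 0) return 0.0;
        double weighted = 0.0;
        for (int[] row : counts) {
            int n = Arrays.stream(row).sum();
            weighted += ((double) n / total) * impurity(row);
        }
        return weighted;
    }

    public static void main(String[] args) {
        int[][] perfectSplit = { {5, 0}, {0, 5} };
        int[][] uselessSplit = { {3, 3}, {2, 2} };
        System.out.println(SplitMetric.fromCode(3) + " perfect: " + splitImpurity(perfectSplit));
        System.out.println(SplitMetric.fromCode(3) + " useless: " + splitImpurity(uselessSplit));
    }
}
```

At each node, a tree builder using an impurity-style metric would evaluate every candidate attribute this way and pick the one with the best score.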
Discretization parameter
Represents the number of intervals used to discretize continuous attributes. The discretization parameter can be changed between runs of the algorithm.
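To show how an interval count turns a continuous value into a discrete one, here is a minimal Java sketch of equal-width binning. This page does not say which discretization method the agent uses, so equal-width intervals are an assumption, and the class EqualWidthDiscretizer and its method binOf are hypothetical names.

```java
/** Hypothetical helper: equal-width binning is assumed, not confirmed by this page. */
final class EqualWidthDiscretizer {
    private final double min;
    private final double max;
    private final int intervals;

    /**
     * @param min       smallest value observed for the attribute
     * @param max       largest value observed for the attribute (must exceed min)
     * @param intervals number of intervals, as set by the discretization parameter
     */
    EqualWidthDiscretizer(double min, double max, int intervals) {
        if (intervals < 1) throw new IllegalArgumentException("intervals must be >= 1");
        if (!(max > min)) throw new IllegalArgumentException("max must be greater than min");
        this.min = min;
        this.max = max;
        this.intervals = intervals;
    }

    /** Returns the zero-based interval index for a value, clamping out-of-range values. */
    int binOf(double value) {
        if (value <= min) return 0;
        if (value >= max) return intervals - 1;
        int bin = (int) ((value - min) * intervals / (max - min));
        return Math.min(bin, intervals - 1);
    }

    public static void main(String[] args) {
        EqualWidthDiscretizer d = new EqualWidthDiscretizer(0.0, 10.0, 5);
        for (double v : new double[] {-3.0, 0.5, 2.5, 9.9, 10.0, 25.0}) {
            System.out.println(v + " -> interval " + d.binOf(v));
        }
    }
}
```

A larger interval count preserves more detail from the continuous attribute but leaves fewer training records in each interval.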
numAttributes
Represents the number of attributes. This property is set when the decision tree bean is created based on the data in the input file, and cannot be changed later.
numClasses
Represents the number of possible classes for a particular data set. This property is set when the decision tree bean is created based on the data in the input file, and cannot be changed later.
numRecords
Represents the number of records in the input file. This property is set when the decision tree bean is created based on the data in the input file, and cannot be changed later.
Agent Mode
Select one of the following agent modes:
Train implies that the data is read from the import bean corresponding to the input file and used to construct a decision tree model.
Test implies that the decision tree model is used to classify new data and compute the accuracy of classification based on known labels of test data. 
Run implies that the model is used to classify new data as it is processed. No feedback about accuracy can be given in this case, as the correct labels are not known.
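The difference between Test and Run comes down to whether known labels are available for comparison. The following minimal Java sketch shows the accuracy figure that Test mode can report and Run mode cannot. The class AccuracySketch and its accuracy method are hypothetical names, and defining accuracy as the fraction of correctly classified records is an assumption about how the agent computes it.

```java
/** Hypothetical helper: accuracy as the fraction of records classified correctly. */
final class AccuracySketch {

    /**
     * @param predicted class labels produced by the decision tree model
     * @param actual    known class labels from the test data
     * @return fraction of records where the predicted label matches the known label
     */
    static double accuracy(String[] predicted, String[] actual) {
        if (predicted.length != actual.length) {
            throw new IllegalArgumentException("label arrays must have the same length");
        }
        if (predicted.length == 0) {
            throw new IllegalArgumentException("at least one record is required");
        }
        int correct = 0;
        for (int i = 0; i < predicted.length; i++) {
            if (predicted[i].equals(actual[i])) correct++;
        }
        return (double) correct / predicted.length;
    }

    public static void main(String[] args) {
        String[] predicted = {"yes", "no", "yes", "no"};
        String[] actual    = {"yes", "no", "no",  "no"};
        System.out.println("accuracy = " + accuracy(predicted, actual));
    }
}
```

In Run mode only the predicted labels exist, so there is nothing to compare against and no accuracy can be computed.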

Use

The Decision Tree Classifier Agent panel is used to generate an agent containing import beans for training and testing, a decision tree bean, filter beans to translate decision tree inputs and outputs, and data connections. The Agent Mode is set so that the decision tree bean can be trained (data is used to construct the decision tree model), or tested or run on a different test file.

Steps in using the decision tree classifier agent include:

  1. Enter the name of the training file that defines the record layout and name of the training data source. If the file is read successfully, the Generate Beans button should be enabled. 
  2. Select Train for the Agent Mode.
  3. Set the metric and discretization values.
  4. Press the Generate Beans button. The Reset button should now be enabled. The mode should also now be set to Test, because the training data has already been used to construct the decision tree model.
  5. Change the mode to Run if necessary.
  6. Press the OK button to apply the changed values.
  7. At some point you may wish to press the Reset Beans button to re-initialize all the beans to their initial default values.