

Import Bean Properties and Use

Properties

The Import Bean panel provides these options:

Definition file
Enter the path and name of a definition file, or use the Browse button to select one. This file can be used to generate a Filter bean. The data itself must be in a file of the same name with an extension of ".dat".
Delimiters
Select one of the following delimiters:
Spaces: space characters are used between fields.
Commas: commas are used between fields.
Tabs: tab characters are used between fields.
Record cache count
Enter the number of records to be read from the file and stored in memory. To read the entire file and store it in memory, select All.
Randomize data
Check this box if the file data should be presented in a random sequence. For some beans, such as BackPropagation, this can be beneficial; for other beans, such as Temporal Difference Learning, the record sequence is critical.
Steps per cycle
Enter the number of records to be processed in one cycle.
Relative to file size
Check this box if the Steps per cycle value is to be used in conjunction with the number of records in the file. When not selected, the Steps per cycle value is the absolute number of records in a cycle. When selected, the value is multiplied by the file size to obtain the number of records in a cycle. This allows the cycle size to be proportionate to the number of records in the file: 0.5 means half the records in the file are processed in each cycle, and 2.0 means each record in the file is processed twice in one cycle.
Compute field statistics
Check this box if the field statistics are to be computed when the file is opened. When field statistics are computed, the definition file is read to determine the record layout for the file, and the values in each field are examined. For each discrete and categorical field, the set of unique values is determined. For continuous fields, the minimum, maximum, and mean are determined. When field statistics are not computed, the record layout and statistics used to read the file are 'borrowed' from another data source. Starting with the active data source, the container agent's list of data sources is scanned until field layout and statistics data are located from a previously opened data source. If none are found, the definition file is read and statistics are computed regardless of this setting.
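
As a rough illustration of two of the computations described above, the following Python sketch shows how a cycle size could be derived from the Steps per cycle value and how per-field statistics could be gathered. It is not the bean's implementation: the function names records_per_cycle and field_statistics, the field type strings, and the truncation to a whole record are assumptions made for this example.

```python
from statistics import mean


def records_per_cycle(steps_per_cycle, file_size, relative_to_file_size):
    """Return the number of records processed in one cycle."""
    if relative_to_file_size:
        # The value acts as a multiplier of the record count (0.5, 2.0, ...).
        # Truncating to a whole record is an assumption of this sketch.
        return int(steps_per_cycle * file_size)
    # Otherwise the value is already an absolute record count.
    return int(steps_per_cycle)


def field_statistics(values, field_type):
    """Summarize one field's values according to its type."""
    if field_type in ("discrete", "categorical"):
        # Discrete and categorical fields: the set of unique values.
        return {"unique": sorted(set(values))}
    # Continuous fields: minimum, maximum, and mean.
    numbers = [float(v) for v in values]
    return {"min": min(numbers), "max": max(numbers), "mean": mean(numbers)}
```

For a file of 1000 records, records_per_cycle(0.5, 1000, True) gives 500 and records_per_cycle(2.0, 1000, True) gives 2000, which matches the two examples given for Relative to file size.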
 

Use

The Import Bean panel is used to specify and operate on a file from which data is to be read. It can generate a filter bean to convert the values it reads, which are usually passed on to a bean such as a neural network.
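
To give a rough picture of what the options listed under Properties control, the Python sketch below locates the ".dat" file that accompanies a definition file, splits each record on the chosen delimiter, caches a limited number of records, and optionally shuffles them. It is an illustration only, not the bean's code: the names DELIMITERS, data_path_for, and read_records, the seed parameter, and the skipping of blank lines are all assumptions of this example.

```python
import random
from pathlib import Path

# Hypothetical mapping from the panel's delimiter choices to split() arguments.
# None makes split() break on any run of whitespace.
DELIMITERS = {"Spaces": None, "Commas": ",", "Tabs": "\t"}


def data_path_for(definition_path):
    """Return the data file path: same name as the definition file, ".dat" extension."""
    return Path(definition_path).with_suffix(".dat")


def read_records(definition_path, delimiter="Commas", cache_count=None,
                 randomize=False, seed=None):
    """Read up to cache_count records (None means all) as lists of field strings."""
    separator = DELIMITERS[delimiter]
    records = []
    with open(data_path_for(definition_path)) as data_file:
        for line in data_file:
            if not line.strip():
                continue  # skip blank lines (an assumption of this sketch)
            if cache_count is not None and len(records) >= cache_count:
                break  # the record cache is full
            fields = line.rstrip("\n").split(separator)
            records.append([field.strip() for field in fields])
    if randomize:
        # Present the cached records in a random sequence.
        random.Random(seed).shuffle(records)
    return records
```

Passing cache_count=None corresponds to selecting All for the record cache count, and randomize=True corresponds to checking Randomize data.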

Steps in using the panel include:

  1. Enter the path and name of the import definition file from which to read data.
  2. Select Spaces, Commas, or Tabs to match the delimiter used in the data file.
  3. Select the number of records to cache in memory.
  4. Check Randomize data if beneficial to the receiving bean.
  5. If this file is a test data source, has previously been opened with statistics computed, or for some other reason its layout and statistics are to be derived from another data source in the container, uncheck the Compute field statistics box.
  6. Press the Open File button. This will open the file, count the number of records it contains, and read the number of records specified into memory.
  7. Press the Generate Filters button to create a translate filter that will convert the import data to numeric values, and a translate filter that will revert numeric values for output fields to the original domain. The convert filter will be connected to the import bean.
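
The pair of filters created in the last step can be pictured as a conversion and its inverse. The Python sketch below is one possible scheme, assumed here for illustration: continuous values are scaled to the range 0 to 1 using the field's minimum and maximum, and categorical values are mapped to their index in the sorted set of unique values. The actual encoding used by the generated filters is not stated in this topic, and the names make_continuous_filter and make_categorical_filter are hypothetical.

```python
def make_continuous_filter(minimum, maximum):
    """Return (convert, revert) functions that scale to and from [0, 1]."""
    span = maximum - minimum

    def convert(value):
        # Guard against a constant field, where minimum equals maximum.
        return (value - minimum) / span if span else 0.0

    def revert(scaled):
        return minimum + scaled * span

    return convert, revert


def make_categorical_filter(unique_values):
    """Return (convert, revert) functions mapping categories to numeric indices."""
    ordered = sorted(unique_values)
    index_of = {value: i for i, value in enumerate(ordered)}

    def convert(value):
        return float(index_of[value])

    def revert(number):
        # Round to the nearest index before looking up the original value.
        return ordered[int(round(number))]

    return convert, revert
```

In this sketch the convert function plays the role of the filter connected to the import bean, and the revert function plays the role of the filter that returns output fields to their original domain.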