The Import Bean panel provides these options:
- Definition file
- Enter the path and name of a definition file, or use the Browse
button to select one. This file can be used to generate a Filter
bean. The data itself must be in a file with the same name
and an extension of ".dat".
- Delimiters
- Select one of the following delimiters:
Spaces: space characters are used between fields.
Commas: commas are used between fields.
Tabs: tab characters are used between fields.
- Record cache count
- Enter the number of records to be read from the file and
stored in memory. To read the entire file and store it in
memory, select All.
- Randomize data
- Check this box if the file data should be presented in a
random sequence. For some beans, such as BackPropagation,
this can be beneficial; for others, such as Temporal
Difference Learning, the record sequence is critical and
the data should not be randomized.
- Steps per cycle
- Enter the number of records to be processed in one cycle.
- Relative to file size
- Check this box if the Steps per cycle value is to be
used in conjunction with the number of records in the file.
When not selected, the Steps per cycle value is the absolute
number of records in a cycle. When selected, the value is
multiplied by the file size to obtain the number of records
in a cycle. This allows the cycle size to be proportional to
the number of records in the file. For example, 0.5 means that
half the records in the file are processed in each cycle, and
2.0 means that each record is processed twice in one cycle.
- Compute field statistics
- Check this box if the field statistics are to be computed
when the file is opened. When field statistics are
computed, the definition file is read to determine the
record layout for the file. The values in each field are
examined. For each discrete and categorical field, the
set of unique values is determined. For continuous
fields, the minimum, maximum, and mean are determined.
When field statistics are not computed, the record layout
and statistics used to read the file are 'borrowed' from
another data source. Starting with the active data
source, the container agent's list of data sources is
scanned until field layout and statistics data are found
in a previously opened data source. If none is found,
the definition file is read and statistics are computed
regardless of this setting.
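The statistics described under Compute field statistics can be illustrated with a short Python sketch. It is not the product's implementation: the function name `field_statistics`, the layout format, and the field-type labels are all hypothetical, chosen only to show which values are collected for each kind of field.

```python
from statistics import mean


def field_statistics(records, layout):
    """Compute per-field statistics for a list of records.

    records: list of rows, each row a list of string values.
    layout:  list of (field_name, kind) pairs, where kind is
             "continuous", "discrete", or "categorical"
             (hypothetical labels used for this sketch).
    """
    stats = {}
    for index, (name, kind) in enumerate(layout):
        column = [row[index] for row in records]
        if kind == "continuous":
            # Continuous fields: minimum, maximum, and mean.
            values = [float(v) for v in column]
            stats[name] = {
                "min": min(values),
                "max": max(values),
                "mean": mean(values),
            }
        else:
            # Discrete and categorical fields: the set of unique values.
            stats[name] = {"unique": sorted(set(column))}
    return stats


records = [["red", "1.0"], ["blue", "3.0"], ["red", "2.0"]]
layout = [("color", "categorical"), ("size", "continuous")]
stats = field_statistics(records, layout)
```

Here `stats["color"]` holds the unique values of the categorical field, and `stats["size"]` holds the minimum, maximum, and mean of the continuous field.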
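The interaction between Steps per cycle and Relative to file size, described in the options above, comes down to a small calculation. The sketch below is illustrative only: the function name `records_per_cycle` is hypothetical, and rounding a fractional result to the nearest whole record is an assumption, since the help text does not say how fractions are handled.

```python
def records_per_cycle(steps_per_cycle, file_record_count, relative_to_file_size):
    """Return the number of records processed in one cycle."""
    if relative_to_file_size:
        # Relative mode: the value is a multiple of the file size.
        # Rounding to a whole record is an assumption of this sketch.
        return int(round(steps_per_cycle * file_record_count))
    # Absolute mode: the value is the record count itself.
    return int(steps_per_cycle)


half_file = records_per_cycle(0.5, 1000, relative_to_file_size=True)
two_passes = records_per_cycle(2.0, 1000, relative_to_file_size=True)
absolute = records_per_cycle(250, 1000, relative_to_file_size=False)
```

For a file of 1000 records, a relative value of 0.5 gives 500 records per cycle, a relative value of 2.0 gives 2000, and an absolute value of 250 gives 250 whatever the file size.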
The Import Bean panel is used to specify and operate on a file
from which data is to be read. It can generate a filter bean to convert
the file's values, which are usually passed on to a bean such as a neural
network.
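As a rough illustration of how the file-related options fit together, the following Python sketch derives the ".dat" file name from a definition file name, splits records on the selected delimiter, limits the number of cached records, and optionally shuffles them. All names here (`data_file_for`, `read_records`, `DELIMITERS`, and the ".def" extension in the example) are hypothetical. Shuffling only the cached records is also an assumption, as is treating Spaces as any run of whitespace.

```python
import random
from pathlib import Path

# Map panel choices to split separators. None makes str.split() break
# on runs of whitespace, which this sketch uses for "Spaces".
DELIMITERS = {"Spaces": None, "Commas": ",", "Tabs": "\t"}


def data_file_for(definition_path):
    """Return the data file path: same name, ".dat" extension."""
    return Path(definition_path).with_suffix(".dat")


def read_records(lines, delimiter="Spaces", cache_count=None,
                 randomize=False, seed=None):
    """Split lines into records, keeping at most cache_count of them.

    cache_count=None stands in for the panel's "All" choice.
    """
    separator = DELIMITERS[delimiter]
    records = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        records.append([field.strip() for field in line.split(separator)])
        if cache_count is not None and len(records) >= cache_count:
            break
    if randomize:
        # Shuffle the cached records; a seed makes the order repeatable.
        random.Random(seed).shuffle(records)
    return records


sample = ["a,1\n", "b,2\n", "c,3\n"]
cached = read_records(sample, delimiter="Commas", cache_count=2)
shuffled = read_records(sample, delimiter="Commas", randomize=True, seed=0)
```

In this sketch `lines` can be any iterable of text lines, for example an open file object for the path returned by `data_file_for`.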
Steps in using the panel include:
- Enter the path and name of the import definition file
from which to read data.
- Select Spaces, Commas, or Tabs to match the delimiter used
in the data file.
- Select the number of records to cache in memory.
- Check Randomize data if beneficial to the
receiving bean.
- If this file is a test data source, has previously been
opened with statistics computed, or for some other reason
its layout and statistics are to be derived from another
data source in the container, uncheck the Compute
field statistics box.
- Press the Open File button. This will open the
file, count the number of records it contains, and read
the number of records specified into memory.
- Press the Generate Filters button to create a
translate filter that will convert the import data to
numeric values, and a translate filter that will revert
numeric values for output fields to the original domain.
The convert filter will be connected to the import bean.
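The pair of filters created by the Generate Filters button can be pictured as a convert function and a matching revert function built from the field statistics. The Python sketch below is only an assumption about how such a pair might behave: the function names are hypothetical, and the encodings (an index for categorical values, scaling to the range 0 to 1 for continuous values) are chosen for illustration and are not taken from the product.

```python
def make_categorical_filters(unique_values):
    """Return (convert, revert) functions for a categorical field."""
    to_code = {value: float(i) for i, value in enumerate(unique_values)}

    def convert(value):
        # Original value -> numeric code.
        return to_code[value]

    def revert(code):
        # Numeric code -> original value (nearest index).
        return unique_values[int(round(code))]

    return convert, revert


def make_continuous_filters(minimum, maximum):
    """Return (convert, revert) functions for a continuous field."""
    span = maximum - minimum

    def convert(value):
        # Scale into the range 0..1; a constant field maps to 0.0.
        return (value - minimum) / span if span else 0.0

    def revert(scaled):
        # Map a scaled value back to the original domain.
        return minimum + scaled * span

    return convert, revert


convert_color, revert_color = make_categorical_filters(["blue", "red"])
convert_size, revert_size = make_continuous_filters(1.0, 3.0)
```

With these definitions, `convert_color("red")` yields a numeric code that `revert_color` maps back to `"red"`, and `revert_size(convert_size(2.0))` returns the original value.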