Example data and definition files are provided in the
examples\datafiles directory. These files may be used by any bean.
Some data sets also include companion data for testing, provided in a
file with the suffix Test.
The following sample data sources are provided:
- Animal
- This file contains descriptions of seven different
animals and is used for supervised learning techniques.
The file with the Test suffix can be used to
verify training. The file with the suffix Test2
contains an additional record designed to illustrate the
need to provide all possible values during training and
to give the user the opportunity to create a custom
filter that puts more credence in the weight field.
- Uses: AbleNeuralClassifierAgent,
AbleDecisionTree.
- Statistics: 7 records; inputs: 5
categorical, 1 discrete, 1 continuous; outputs: 1
categorical.
- BondRating
- This file contains data about various bond issues and can
be used to rank the most promising issues using a fuzzy
ruleset.
- Uses: AbleFuzzyRuleSet,
AbleDecisionTree, AbleSelfOrganizingFeatureMap.
- Statistics: 37 records; inputs: 4
continuous; outputs: 0.
- coc1
- This file is a binary dataset used for supervised
learning techniques.
- Uses: AbleNaiveBayes.
- Statistics: 1600 records; inputs: 59
discrete; outputs: 1 discrete.
- CoinIdentifier
- This file contains descriptions of five different United
States coins and is used for supervised learning
techniques.
- Uses: AbleNeuralClassifierAgent,
AbleDecisionTree.
- Statistics: 5 records; inputs: 6
categorical; outputs: 1 categorical.
- ColorScore
- Each record in this file contains two colors, a numeric
indicator, and the resultant score. This example is
useful for supervised learning beans for classification.
- Uses: AbleDecisionTree.
- Statistics: 15 records; inputs: 2
categorical, 1 continuous; outputs: 1 categorical.
- CreditScore
- This file contains credit and financial history for
individuals and is used to calculate credit limits. This
example is useful for supervised learning beans for
classification. If modified for unsupervised learning, it
could be used to categorize an individual's credit
worthiness by assigning them to a cluster with similar
characteristics.
- Uses: AbleNeuralClassifierAgent,
AbleNeuralPredictionAgent.
- Statistics: 148 records; inputs: 2
categorical, 3 continuous, 1 discrete; outputs: 1
categorical.
- DiscountCustomer
- Each record in this file contains a customer transaction
request consisting of the customer id and name, item to
purchase, and list price. This data is used by the
DiscountAgent example for CommonRules.
- Uses: AbleCommonRules.
- Statistics: 2 records; inputs: 3
categorical, 2 continuous, 1 discrete; outputs: 0.
- MarketAnalysis
- This file contains past records of purchases by customers
and some information about each customer. It is used to
cluster similar customers into groups for target
marketing, which is an unsupervised learning task.
- Uses: AbleNeuralClusteringAgent,
AbleSelfOrganizingFeatureMap.
- Statistics: 300 records; inputs: 1
categorical, 5 continuous; outputs: 0.
- Medical
- The records in this file represent patient symptoms. It
is used to diagnose medical conditions using a fuzzy
ruleset.
- Uses: AbleFuzzyRuleSet.
- Statistics: 12 records; inputs: 9
categorical, 1 continuous; outputs: 0.
- Mortgage
- This file contains information describing mortgage
requests and their dispositions. It is used for
supervised learning techniques. The file with the Test
suffix can be used to verify training.
- Uses: AbleNeuralClassifierAgent,
AbleFuzzyRuleSet.
- Statistics: 30 records; inputs: 1
categorical, 1 discrete, 9 continuous, 1 ignore; outputs:
6 categorical.
- PolicyTest1
- This file contains high-level descriptors of system load
and corresponding status for use in setting service level
policies. The agent that uses it is an example of working
with multiple rule sets.
- Uses: AbleRuleSet.
- Statistics: 12 records; inputs: 2
categorical; outputs: 0.
- SalesForecast
- Each record represents a store's sales on a particular
day. The objective is to forecast the change in sales for
the following day.
- Uses: AbleBackPropagation (recurrent),
AbleRadialBasisFunctionNet.
- Statistics: 365 records; inputs: 2
continuous, 2 discrete; outputs: 1 continuous.
- SystemManagement
- Each record in this file contains a snapshot of a
system's performance values at a point in time. This
example is useful for supervised learning beans for
classification. The output class is the system
utilization level, which is one of idle, underused, normal,
overused, or danger.
- Uses: AbleBackPropagation.
- Statistics: 15 records; inputs: 4
categorical, 2 discrete, 13 continuous; outputs: 1
categorical.
- Tdwalk
- Each record in this file is one step in a sequence. The
boolean value true indicates the object location. When
the object reaches the leftmost or rightmost position,
the sequence ends. Because of the unique flags that
indicate the start, middle, and end of the sequence, this
data is useful only for temporal difference learning.
- Uses: AbleTemporalDifferenceLearning.
- Statistics: 2008 records; inputs: 6
continuous; outputs: 1 continuous.
- Trout
- Each record contains the attributes describing the
appearance of one of nine Wyoming game fish. This example
is used by supervised learning beans such as neural
classification and decision tree. Ignore the fish type
field if the data is to be used for unsupervised learning. The
first 100 records can be used for training, and the last
25 for testing. Within each of those groups, the
records are sorted by fish type, so some algorithms may
benefit from the randomize feature of the AbleImport bean.
- Uses: AbleNeuralClassifierAgent (with
randomize), AbleNeuralPredictionAgent, AbleDecisionTree,
AbleFuzzyRuleSet.
- Statistics: 125 records; inputs: 5
categorical, 2 continuous, 1 discrete; outputs: 1
categorical.
- Vehicle
- This file contains data describing various vehicles. It
can be used for either forward or backward chaining in
boolean rulesets.
- Uses: AbleRuleSet.
- Statistics: 7 records; inputs: 3
categorical, 2 discrete; outputs: 0.
- xor
- Each record in this file contains two binary inputs and
the result of the exclusive OR operation applied to those
inputs. This example is useful for supervised learning
beans for classification.
- Uses: AbleNeuralPredictionAgent.
- Statistics: 4 records; inputs: 2
continuous; outputs: 1 continuous.
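
The Trout entry above describes using the first 100 records for training and the last 25 for testing, with the records sorted by fish type inside each group. The following is a minimal Python sketch of that split, with the training rows shuffled so a learner does not see one class at a time. It is an illustration only and does not use the AbleImport bean or any ABLE API. The function name `split_and_shuffle`, the fixed seed, and the placeholder records and `type` labels are all assumptions made for the example.

```python
import random


def split_and_shuffle(records, n_train=100, seed=0):
    """Split into leading training rows and trailing test rows.

    Only the training rows are shuffled; the test rows keep their order.
    A fixed seed (an assumption here) makes the shuffle repeatable.
    """
    train = list(records[:n_train])
    test = list(records[n_train:])
    random.Random(seed).shuffle(train)
    return train, test


# Placeholder data standing in for the 125 Trout records (hypothetical values):
# each record is (row_number, fish_type_label), sorted by label within the
# first 100 rows and again within the last 25 rows.
records = [(i, f"type{i * 9 // 100}") for i in range(100)]
records += [(i, f"type{(i - 100) * 9 // 25}") for i in range(100, 125)]

train, test = split_and_shuffle(records)
print(len(train), len(test))  # prints "100 25"
```

Shuffling only the training portion keeps the test records in their original order, which makes it easier to compare results against the source file.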