Self-Organizing Map Bean
The self-organizing feature map (SOFM) develops a probability density
map of the input vector space that is useful for data reduction. It builds
a map that projects from an n-dimensional input space onto a two-dimensional
output space. This output space is constrained to a rectangular area. The
training data consists of real-valued input vectors.
Implementation
Feature maps are implemented as described in papers by Kohonen and DeSieno.
The ABLE implementation uses a conscience mechanism so that all units have an
approximately equal probability of winning.
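The conscience mechanism can be sketched in Java as follows. This is a minimal
illustration of DeSieno's technique, based on the beta and Conscience parameter
descriptions later in this section; it is not the ABLE API, and all class,
field, and method names are hypothetical.

// Sketch of DeSieno-style conscience competition. Hypothetical names;
// not the ABLE API.
public class ConscienceCompetition {
    private final double[][] weights;   // one weight vector per output unit
    private final double[] winFraction; // running estimate of each unit's win rate
    private final double beta;          // e.g. 0.0001
    private final double conscience;    // the Conscience bias factor

    public ConscienceCompetition(double[][] weights, double beta, double conscience) {
        this.weights = weights;
        this.winFraction = new double[weights.length];
        this.beta = beta;
        this.conscience = conscience;
    }

    /** Returns the index of the winning output unit for one input vector. */
    public int findWinner(double[] input) {
        int winner = -1;
        double best = Double.POSITIVE_INFINITY;
        int n = weights.length;
        for (int j = 0; j < n; j++) {
            double dist = squaredDistance(input, weights[j]);
            // Units that win too often are penalized; frequent losers get
            // a positive bias and can win even when not closest.
            double bias = conscience * (1.0 / n - winFraction[j]);
            double score = dist - bias;
            if (score < best) { best = score; winner = j; }
        }
        // Update each unit's estimated fraction of wins.
        for (int j = 0; j < n; j++) {
            double won = (j == winner) ? 1.0 : 0.0;
            winFraction[j] += beta * (won - winFraction[j]);
        }
        return winner;
    }

    private static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }
}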
Architecture
The inputs for this algorithm are simply the dimensionality of the input
data. However, deciding how to set up the output layer is a crucial part
of using feature maps.
The number of output units becomes the major factor in how a map self-organizes.
Each output unit responds to a hypersphere in the input space. The number
of output units required depends on whether you primarily want data reduction
or the topological mapping function.
All ABLE-provided neural networks share some
common attributes.
The following additional parameters relate to a self-organizing feature map
network:
Architecture parameters
-
Number of inputs
-
Sets the number of units allocated for the input layer. This must be an
integer value between 1 and 2000.
-
Number of output rows
-
Sets the number of rows for the units allocated for the output layer. This
must be an integer value between 1 and 100. For example, if you are projecting
to a single dimension, you would set output rows equal to 1 and output
columns equal to the number of categories.
-
Number of output columns
-
Sets the number of columns for the units allocated for the output layer.
This must be an integer value between 1 and 100. For example, if you are
projecting to a two-dimensional output space, you can set the output to
be any M-by-N rectangular space.
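Because the output layer is rectangular, a winner index can be translated into
a row and column position on the output grid. A minimal sketch, assuming
row-major numbering of the output units (the numbering scheme actually used by
ABLE is not specified here):

// Translate a winner index into output-grid coordinates, assuming
// row-major numbering of the output units. Illustrative only.
int rows = 10, cols = 10;   // an example 10-by-10 output layer
int winner = 37;            // index of the winning output unit
int row = winner / cols;    // row 3
int col = winner % cols;    // column 7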
Other parameters
-
Winner
-
Is the index of the winning output unit.
-
Winner Activation
-
The activation value of the winning processing unit.
-
Learn Rate
-
Controls how much the weights are changed during a weight update. The larger
the value, the more the weights are changed. This must be a real value
between 0.0 and 10.0.
-
Beta
-
Used to calculate the percentage of time that a unit has won the competition;
usually set to a small value such as 0.0001. This must be a real value
between 0.0 and 1.0.
-
Conscience
-
The bias factor that determines how far a losing unit can be from the input
vector and still win the competition. This must be a real value between
0.0 and 1.0.
-
Neighborhood
-
Controls how many units surrounding the closest unit to the input pattern
are modified during a weight update. This must be an integer between 0
and the sum of the number of output rows and columns. If the neighborhood
parameter is set to 1, every unit adjacent to the winner has its weights
changed. A rule of thumb is to set the neighborhood parameter to one-third
the maximum of the number of output rows and columns.
-
Learn Rate Delta
-
Controls how much the learn rate parameter is decreased after each training
epoch. This must be a real value between 0.0 and 1.0.
-
Neighborhood Rate
-
Controls how many epochs of training pass before the neighborhood
parameter value is decreased by one (see the decay sketch after this list).
This must be a real value between 0.0 and 10 000.0. The size of this parameter
depends on the number of input patterns in your training data set;
approximately 1000 patterns should be presented to the network before the
neighborhood is decremented.
-
Initial Learn Rate
-
The initial learn rate used whenever the network is reset. This value is
then decremented during training.
-
Initial Neighborhood
-
The initial neighborhood value used whenever the network is reset. This
value is then decremented during training.
-
Maximum Number of Epochs
-
The maximum number of epochs used when training the network. If set to
zero, training continues until a breakpoint is reached or training is
otherwise halted. Values range from 0 to 10 000, with a default of zero.
A positive value halts training after that number of epochs has been processed.
-
Sigma
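The learn rate and neighborhood schedules described by the parameters above
can be sketched as follows. This is a minimal illustration, not the ABLE API;
all names are hypothetical, and the exact form of the learn rate decay is an
assumption.

// Sketch of the per-epoch decay schedules described above.
// Hypothetical names; not the ABLE API.
interface Som {
    void present(double[] pattern, double learnRate, int neighborhood);
}

class DecaySchedule {
    static void train(Som som, double[][] patterns,
                      double initialLearnRate, double learnRateDelta,
                      int initialNeighborhood, double neighborhoodRate,
                      int maxEpochs) {
        double learnRate = initialLearnRate;
        int neighborhood = initialNeighborhood;
        for (int epoch = 1; epoch <= maxEpochs; epoch++) {
            for (double[] x : patterns) {
                som.present(x, learnRate, neighborhood); // one pattern presentation
            }
            // Learn rate delta: decrease the learn rate after every epoch.
            // A multiplicative decay is assumed here; the text does not
            // specify the exact form.
            learnRate *= (1.0 - learnRateDelta);
            // Neighborhood rate: after roughly this many epochs, shrink
            // the neighborhood by one step until it reaches zero.
            if (neighborhood > 0 && epoch % Math.max(1, (int) neighborhoodRate) == 0) {
                neighborhood--;
            }
        }
    }
}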
Training
The training data consists of real-valued input vectors scaled to values
between 0.0 and 1.0. Several parameters must be adjusted to ensure good
training of a feature map. The implementation requires a bias factor called
Conscience,
which allows losing units to win sometimes, even though they are not the
closest units to the input vector. The Conscience term determines how far
a losing output unit can be from the input and still be declared the winner.
The constant term beta is used to calculate the percentage of time that
a unit wins the competition; it should be set to a very small value,
between 0.0001 and 0.001.
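The weight update itself can be sketched as follows: every unit within the
current neighborhood of the winner (found with the conscience-biased
competition shown earlier) is moved toward the input pattern. A minimal sketch,
assuming a square (Chebyshev-distance) neighborhood on the output grid; the
names are hypothetical, not the ABLE API.

// Move the winner and its grid neighbors toward the input pattern.
// A sketch of the standard Kohonen update; hypothetical names.
static void updateWeights(double[][] weights, int rows, int cols,
                          int winner, double[] input,
                          double learnRate, int neighborhood) {
    int winRow = winner / cols, winCol = winner % cols;
    for (int j = 0; j < rows * cols; j++) {
        int dr = Math.abs(j / cols - winRow);
        int dc = Math.abs(j % cols - winCol);
        // Units within `neighborhood` grid steps of the winner are adjusted.
        if (Math.max(dr, dc) <= neighborhood) {
            for (int i = 0; i < input.length; i++) {
                weights[j][i] += learnRate * (input[i] - weights[j][i]);
            }
        }
    }
}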
Running
Feature maps are very fast at run time, requiring only a single multiplication
of the input vector by the weight matrix.
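A run-time pass can therefore be sketched as one pass over the weight matrix;
the unit with the largest activation is reported as the winner. A minimal
sketch with hypothetical names, not the ABLE API:

// Run-time pass: activations are the dot products of the input with each
// unit's weight vector; the winner is the unit with the largest activation.
static int run(double[][] weights, double[] input) {
    int winner = 0;
    double winnerActivation = Double.NEGATIVE_INFINITY;
    for (int j = 0; j < weights.length; j++) {
        double activation = 0.0;
        for (int i = 0; i < input.length; i++) {
            activation += weights[j][i] * input[i];
        }
        if (activation > winnerActivation) {
            winnerActivation = activation; // the Winner Activation parameter
            winner = j;                    // the Winner parameter
        }
    }
    return winner;
}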