The radial basis function (RBF) network is a feed-forward neural network with a single layer of hidden units. Radial basis function networks differ from back propagation networks in that they have only one layer of hidden units and do not use the sigmoid activation function in those units. Instead, the radial basis function network has fixed feature detectors in the hidden layer which use a specified basis function to detect and respond to localized portions of the input vector space. One advantage of radial basis networks over back propagation is that, if the input signal is non-stationary, the localized nature of the hidden layer response makes the network less susceptible to "memory loss," or, as some would say, "weight loss."
The radial basis function network as implemented has a single hidden layer of units which can use one of three basis functions: Gaussian, thin plate spline, or multiquadric. The basis centers or weight vectors are learned during an initial stage of self-organized learning, currently fixed at 15 epochs. Once the basis vectors are set, the output layer weights are adjusted using back propagation.
When creating a radial basis network, you must specify the architecture parameters, including which basis function the hidden layer uses. Each basis function is evaluated on v, the Euclidean norm: the distance between the input vector and the hidden unit center, calculated as the square root of the sum of the squared element differences:

    v = sqrt(Σ(Input[i] - Center[i])**2)
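As an illustration, here is a small Python sketch of the hidden unit response. The thin plate spline and multiquadric forms shown are standard textbook definitions, and the Gaussian width sigma and multiquadric offset c are assumed parameters; the implementation's exact forms are not stated above.

    import numpy as np

    def euclidean_norm(inp, center):
        # v = sqrt(Σ(Input[i] - Center[i])**2)
        return np.sqrt(np.sum((inp - center) ** 2))

    def gaussian(v, sigma=1.0):
        # sigma is an assumed width parameter, not given in the text
        return np.exp(-(v ** 2) / (2.0 * sigma ** 2))

    def thin_plate_spline(v):
        # guard v == 0, where v**2 * log(v) is otherwise undefined
        return 0.0 if v == 0.0 else v ** 2 * np.log(v)

    def multiquadric(v, c=1.0):
        # c is an assumed offset parameter, not given in the text
        return np.sqrt(v ** 2 + c ** 2)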
There are several parameters to set during the initial and final training phases. To train the basis weights, you must set the autocenter parameter to TRUE. When this parameter is set, only the hidden layer weights (the basis vectors) are adjusted. Training parameters used during this phase include the learning rate and the error tolerance.
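The text does not spell out the self-organized rule; a common choice for learning basis centers is winner-take-all competitive learning, sketched below under that assumption: for each pattern, the nearest center is moved toward the input by a fraction given by the learning rate.

    import numpy as np

    def train_centers(patterns, centers, learn_rate=0.3, epochs=15):
        # centers: float array of shape (num_hidden, num_inputs)
        # Winner-take-all update, an assumed stand-in for the
        # implementation's self-organized learning phase
        for _ in range(epochs):                  # phase is fixed at 15 epochs
            for x in patterns:
                dists = np.linalg.norm(centers - x, axis=1)
                w = np.argmin(dists)             # winning (nearest) center
                centers[w] += learn_rate * (x - centers[w])
        return centers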
After the basis weights have been set, a second training phase is required. This phase is much like training a back propagation network with a single hidden layer. The learning rate and momentum settings apply, and modifying them can have a large effect on the training performance of the network. These values are commonly set from 0.5 to 0.7 for the learning rate and 0.0 to 0.9 for momentum.
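Because the centers are frozen after the first phase, training the output layer reduces to a single-layer delta rule. The following sketch assumes a linear output layer (the output activation is not stated above) and shows the role of the learning rate and momentum:

    import numpy as np

    def train_output_weights(H, T, learn_rate=0.5, momentum=0.9, epochs=100):
        # H: hidden-layer activations, shape (patterns, hidden units)
        # T: target outputs,           shape (patterns, outputs)
        W = np.zeros((H.shape[1], T.shape[1]))
        dW_prev = np.zeros_like(W)
        for _ in range(epochs):
            err = T - H @ W                      # linear outputs assumed
            # gradient step plus a momentum fraction of the previous step
            dW = learn_rate * H.T @ err / len(H) + momentum * dW_prev
            W += dW
            dW_prev = dW
        return W

The momentum term reuses a fraction of the previous weight change, which smooths the updates and often speeds convergence, consistent with the sensitivity to these two settings noted above.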
The error tolerance setting controls the final training process. If the data set contains binary targets (0, 1), the tolerance parameter is usually set to 0.1. This means that an output is considered "good" when it is within 0.1 of the desired value (that is, at least 0.9 for a target of 1, at most 0.1 for a target of 0). When every output is within the tolerance range of its desired value, the network status is changed to LOCKED and weight updates stop.
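In code, the lock test amounts to checking every output of every pattern against the tolerance; the function below is illustrative rather than the product's API:

    def within_tolerance(outputs, targets, tolerance=0.1):
        # True when every output of every pattern is within `tolerance`
        # of its target, at which point the network would be LOCKED
        return all(abs(o - t) <= tolerance
                   for out, tgt in zip(outputs, targets)
                   for o, t in zip(out, tgt))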
You can also set the epoch update flag. If set to TRUE, the weights are changed only after every complete cycle through the training set. This is true gradient descent. When the epoch update flag is set to FALSE, weights are updated after each pattern is presented. For most problems, the network converges more quickly with the epoch update flag set to FALSE.
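The difference between the two modes is only where the weight change is applied, as in this schematic sketch (grad_fn stands in for whatever per-pattern gradient the network computes):

    import numpy as np

    def train_one_epoch(X, T, W, grad_fn, learn_rate=0.5, epoch_update=False):
        # grad_fn(x, t, W) -> error gradient for one pattern (assumed helper)
        total = np.zeros_like(W)
        for x, t in zip(X, T):
            g = grad_fn(x, t, W)
            if epoch_update:
                total += g                  # TRUE: accumulate over the epoch
            else:
                W -= learn_rate * g         # FALSE: update after each pattern
        if epoch_update:
            W -= learn_rate * total         # one true gradient descent step
        return W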
An additional factor in training a radial basis network is the order in which training patterns are presented. By setting the randomize flag on the Import object feeding the Network object, you can ensure that the network is presented with a random ordering of the training patterns. This often helps the network avoid local minima and speeds training.
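Outside the product, the same effect is a fresh shuffle of the pattern order each epoch; for example:

    import random

    def shuffled(patterns):
        # equivalent effect to the Import object's randomize flag:
        # a new random ordering of the training patterns each epoch
        order = list(patterns)
        random.shuffle(order)
        return order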
The run-time performance of a radial basis network is relatively fast. The input vector is propagated through the network by first computing its distance to each basis center and passing each distance through the basis function. The resulting hidden layer vector is then multiplied by the output weight matrix to produce the network outputs array, which is returned to the application program.
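Putting the pieces together, a minimal forward pass under the earlier assumptions (Gaussian hidden units, linear outputs) might look like this:

    import numpy as np

    def rbf_forward(x, centers, W, basis=lambda v: np.exp(-v ** 2)):
        # default basis is an assumed Gaussian with implicit unit width
        v = np.linalg.norm(centers - x, axis=1)  # distance to each center
        h = basis(v)                             # hidden layer responses
        return h @ W                             # network outputs array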