This topic is an introduction to neural network technology and its applications. It is intended to provide background material for those unfamiliar with neural network technology, not as a comprehensive technical treatment of the technology or its applications.
Each of the neural network models supported by the ABLE framework is described in the context of using the AbleBeans. These descriptions are not, and are not intended to be, complete theoretical treatments of the models. The following neural network models are included: Back Propagation, Radial Basis Function, Self-Organizing Feature Maps, and Temporal Difference Learning.
The first work on neural networks was done by Frank Rosenblatt in the late 1950s and early 1960s. His work on the Perceptron is recognized as the beginning of this field. At the same time, Bernard Widrow and Ted Hoff were working on the ADALINE model. These early neural networks showed promise, but they were limited in their problem-solving abilities.
During the mid-1960s, Marvin Minsky and Seymour Papert studied the Perceptron and showed that its learning capabilities were very limited. Their book, Perceptrons, hastened the decline of the first wave of neural network activity.
In the ensuing decades, several researchers continued work on neural networks. These included James Anderson, Leon Cooper, Stephen Grossberg, and Teuvo Kohonen. In the mid-1980s, interest in the field of neural networks revived, sparked by advances in VLSI technology, which made multiple processors on a single chip possible; new learning algorithms, which made it possible to train multilayer networks; and a disenchantment with the symbolic artificial intelligence approach. David Rumelhart and James McClelland's Parallel Distributed Processing books did much to popularize the neural network approach.
The field of neural networks is a multidisciplinary area, with researchers from neurophysiology, cognitive science, mathematics, physics, electrical engineering, and computer science. While the work being done in the various disciplines uses different terminology, researchers see a common thread to their work: a reliance on massive parallelism and adaptive interconnections.
Perhaps the biggest difficulty in using neural networks to solve problems is the dramatic difference in the problem-solving approach. Neural networks require a major paradigm shift. It is not easy to ignore years of training and experience with serial, algorithmic approaches to problem-solving. Neural networks require approaching a problem from a pattern recognition (data driven) perspective, rather than from a control (algorithmic) standpoint.
A neural network is a collection of processing units and adaptive connections designed to perform a specific processing function. It relies on a computing model that is loosely based on how the brain computes: many simple processors connected by adaptive connections. Each of these simple processors sums its inputs and calculates an output value, or activation. This output is then sent to other processing units in the neural network.
The connections are called adaptive because they are adjusted during the training of the neural network. This training process usually consists of presenting examples of input/output relationships to the network. The connection weights are adjusted to minimize the difference between the actual network output value and the desired output value.
The presentation of input/output pairs is called supervised training. Other types of neural networks are presented with input data only. This approach is called unsupervised, and the networks are called self-organizing neural networks. A third, less common, training technique is called reinforcement learning. This involves giving the neural network general feedback on whether its performance is good or bad.
In addition to processing units and connection weights, neural networks are classified by the type of connectivity between units. The most common form is the feed-forward network. Another type is the fully connected, or dynamic, network. Other topologies with partial connectivity are also used.
The processing units are the neural network equivalents of biological neurons. They are extremely simple processors that act as integrators. Different neural network models use different types of processing units. The most common type simply sums all of its inputs multiplied by the input connection weights, subtracts a threshold or bias value, and passes that value through a nonlinear sigmoid function. Variations can include linear or binary activation functions, decay mechanisms so that a unit slowly loses its activation value, and probabilistic factors that determine when a unit turns on. These variations can produce a wide variety of behaviors and affect the performance of the entire neural network.
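To make this concrete, here is a minimal Java sketch of such a unit: it sums its weighted inputs, subtracts a bias, and squashes the result with a sigmoid. The class and member names are illustrative only; they are not part of the ABLE API or any AbleBean.

```java
// Minimal sketch of a common processing unit: weighted sum of inputs,
// minus a bias, passed through a logistic sigmoid. Illustrative only;
// not part of the ABLE API.
public class SigmoidUnit {
    private final double[] weights; // one weight per input connection
    private final double bias;      // threshold subtracted from the sum

    public SigmoidUnit(double[] weights, double bias) {
        this.weights = weights;
        this.bias = bias;
    }

    // Returns the unit's activation for the given input vector.
    public double activate(double[] inputs) {
        double sum = -bias;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * inputs[i];
        }
        return 1.0 / (1.0 + Math.exp(-sum)); // output in (0, 1)
    }
}
```

The weighted sum in activate() is simply a dot product of the input vector with the unit's weight vector.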
The action of a single unit can be viewed as calculating a pattern match between the input and its connection weights. If there is a strong match, the unit has a large output value. If there is a strong mismatch, the unit has a small output value. So, at a micro level, neural networks are acting as pattern-matching machines. While this is an interesting and useful function, it is not very powerful unless many units are used.
Sets of units are usually grouped together in layers to transform an array or vector of inputs in parallel. This amounts to doing a nonlinear vector transformation on the input vector. By using one or more layers of units, very powerful computations can be performed on the input data vector. The various neural network models use different connection topologies, along with specific unit types, to achieve their computational goals.
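As an illustration of such a layer transformation, the sketch below applies one layer of sigmoid units to an input vector; calling it once per layer yields a multilayer feed-forward pass. The class, method, and parameter names are hypothetical, not taken from ABLE.

```java
// Illustrative sketch: one layer as a nonlinear vector transformation.
// Each row of 'weights' holds the connection weights of one unit.
public class LayerSketch {
    public static double[] forward(double[][] weights, double[] bias,
                                   double[] in) {
        double[] out = new double[weights.length];
        for (int j = 0; j < weights.length; j++) { // one unit per row
            double sum = -bias[j];
            for (int i = 0; i < in.length; i++) {
                sum += weights[j][i] * in[i];
            }
            out[j] = 1.0 / (1.0 + Math.exp(-sum)); // sigmoid squashing
        }
        return out;
    }
}
```

Chaining two such calls, with the output of the first layer fed as input to the second, gives the forward pass of a two-layer network.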
A neural network model consists of several parts. The unit type, connection topology (feed-forward, fully interconnected, or hybrid), and learning algorithm are all important aspects of each model. The learning algorithm is the controlling mechanism that determines when and how the adaptive interconnection weights are changed.
Neural networks are very good at pattern-recognition and pattern-matching tasks. This makes them useful for decision-support problems involving multiple variables and noisy or incomplete data. Many common business problems can be posed as pattern-recognition tasks. For example, many tasks performed by human "experts" can best be described as pattern-recognition. The experts are often presented with a problem or a set of data, and they simply recognize the problem and its solution.
The most popular neural network model, back propagation, is a feed-forward model using the generalized delta rule to adjust the weights. It is a pattern-matching network that can be used as a classifier, a time-series forecaster, or as a predictive model much like statistical regression models. It builds an internal model (held in its connection weights) of the relationships between the input and output training patterns. When it is subsequently given an input vector, it tries to reproduce the associated output vector. If the input is one it has never seen before, it produces an output similar to the one associated with the closest matching training input pattern. This generalization capability of producing a reasonable output, even for an input pattern it has never seen before, is one of the advantages of neural networks. It can learn from past experience to deal with new and unexpected situations.
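The sketch below shows what one training step of the generalized delta rule might look like for a small two-layer network with sigmoid units, reusing the hypothetical LayerSketch.forward() shown earlier. It is a minimal rendering of the textbook algorithm under those assumptions, not the ABLE implementation.

```java
// Minimal sketch of one generalized-delta-rule (back propagation)
// training step for a two-layer feed-forward network. Illustrative
// only; not the ABLE implementation.
public class BackPropSketch {
    public static void trainStep(double[][] w1, double[] b1, // hidden layer
                                 double[][] w2, double[] b2, // output layer
                                 double[] input, double[] target,
                                 double rate) {              // learning rate
        double[] hidden = LayerSketch.forward(w1, b1, input);
        double[] output = LayerSketch.forward(w2, b2, hidden);

        // Output-layer error terms: (desired - actual) times the
        // sigmoid derivative o * (1 - o).
        double[] deltaOut = new double[output.length];
        for (int k = 0; k < output.length; k++) {
            deltaOut[k] = (target[k] - output[k])
                        * output[k] * (1.0 - output[k]);
        }

        // Hidden-layer error terms, propagated back through w2.
        double[] deltaHid = new double[hidden.length];
        for (int j = 0; j < hidden.length; j++) {
            double back = 0.0;
            for (int k = 0; k < output.length; k++) {
                back += deltaOut[k] * w2[k][j];
            }
            deltaHid[j] = back * hidden[j] * (1.0 - hidden[j]);
        }

        // Adjust the adaptive connection weights and biases to reduce
        // the difference between actual and desired output.
        for (int k = 0; k < output.length; k++) {
            for (int j = 0; j < hidden.length; j++) {
                w2[k][j] += rate * deltaOut[k] * hidden[j];
            }
            b2[k] -= rate * deltaOut[k];
        }
        for (int j = 0; j < hidden.length; j++) {
            for (int i = 0; i < input.length; i++) {
                w1[j][i] += rate * deltaHid[j] * input[i];
            }
            b1[j] -= rate * deltaHid[j];
        }
    }
}
```

Repeating this step over many input/output pairs gradually builds the internal model held in the connection weights.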
The neural network models that use feedback can be thought of as dynamic systems. From this perspective, a neural network is given an input that disturbs the dynamics of the system. The network then cycles until it settles into a stable state, or equilibrium. If the network's weights are set so that the stable points are information-containing points, then this mechanism can be used for associative memory functions. Another way of looking at these systems is as constraint-satisfaction or optimization machines. If the network settles into a global minimum, it is essentially computing the lowest-energy or lowest-cost solution for the given problem.
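As a concrete illustration of this settling behavior, the following sketch repeatedly updates a Hopfield-style feedback network with bipolar (+1/-1) units until no unit changes state. The names are hypothetical, and a symmetric weight matrix with a zero diagonal is assumed so that the cycling converges; this is not the ABLE implementation.

```java
// Illustrative sketch of a feedback network settling to equilibrium,
// in the style of a Hopfield network. Assumes a symmetric weight
// matrix with zero diagonal.
public class HopfieldSketch {
    public static int[] settle(double[][] weights, int[] initialState) {
        int[] state = initialState.clone();
        boolean changed = true;
        while (changed) {                       // cycle until stable
            changed = false;
            for (int i = 0; i < state.length; i++) {
                double net = 0.0;
                for (int j = 0; j < state.length; j++) {
                    net += weights[i][j] * state[j];
                }
                int next = state[i];
                if (net > 0) next = 1;          // threshold update;
                else if (net < 0) next = -1;    // a tie keeps the state
                if (next != state[i]) {
                    state[i] = next;
                    changed = true;
                }
            }
        }
        return state; // a stable, information-containing pattern
    }
}
```

Each accepted flip lowers the network's energy, which is why the loop ends in one of the stable points described above.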
The processing capabilities differ between the various neural network models. Some types of neural networks are best at pattern matching, while others can handle constraint satisfaction and optimization problems.
Neural networks are good at broad, flat transformations of data. That is, if you have a problem with fifty parameters and need to classify them into ten classes, a neural network can probably do the job. If you have a ten-step algorithm with branch points determining which operation needs to be done next, then neural networks are not the tool you should use. However, it is not always apparent whether a seemingly serial process can be posed as a parallel, pattern-matching type of problem. Quite often a mixture of traditional algorithmic code and neural networks is required to provide a complete problem solution. Fuzzy Rule Systems can be included to allow you to mix multi-step, rule-based decision making with neural networks using the ABLE framework.
Although each neural network model differs in the number of parameters and arrays used, and in the processing of the data, every AbleNetwork has the following parameters defined:
In addition to the standard parameters, each neural network model must have an Input array and an Output array that serve as the Network object's input and output buffers, respectively.
For more information about the neural networks provided with ABLE, refer to:
Back Propagation
Radial Basis Function
Self-Organizing Feature Maps
Temporal Difference Learning