Module Design
This section discusses the design of the MLP module, notably how it can be initialized from data and its runtime C++ interface.
Initialization
The initialization is kept at a high level; only the topology of the MLP is specified. This includes the number of inputs and a description of each layer with its number of units. In XML, this could look something like this:
<layer inputs="4" units="8"/>
<layer units="1"/>
The number of inputs is only specified once, because only the inputs to the first layer are unknown; these are the inputs to the perceptron itself. The inputs to the other layers are the outputs of the previous layers, so they can be inferred. If input counts are specified for the other layers anyway, they must be consistent with the preceding layers.
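As a rough sketch of how a loader might apply this rule, assuming hypothetical LayerDesc and ResolveTopology names that are not part of the module's actual interface:

#include <vector>

// Hypothetical description of one layer as read from the XML;
// an input count of 0 means it was not specified.
struct LayerDesc
{
    int inputs;
    int units;
};

// Infer missing input counts from the previous layer's unit count,
// and reject descriptions that are internally inconsistent.
bool ResolveTopology(std::vector<LayerDesc>& layers)
{
    for (size_t i = 1; i < layers.size(); ++i)
    {
        const int expected = layers[i - 1].units;
        if (layers[i].inputs == 0)
            layers[i].inputs = expected;
        else if (layers[i].inputs != expected)
            return false;
    }
    return true;
}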
The perceptron module also has the capability to save itself to disk. Because developers do not need to (and often cannot) manipulate the data manually, there is no need to describe the format. We can just assume that the implementation of the saving and the loading are internally consistent.
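One possible shape for this pair of operations, shown purely as an illustration (the Save and Load names and signatures are assumptions, not part of the specified interface):

// Hypothetical declarations; the only requirement is that the two
// operations agree on whatever format is written to disk.
void Save(const string& filename) const;
bool Load(const string& filename);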
Interfaces
There are two kinds of interfaces: those expected to read the perceptron's data (simulation), and those intended to update the perceptron's data (training). Separating these two interfaces means the implementation can be made lighter when only simulation is required—a popular feature for game developers.
The interface for simulation consists of only one function. This takes the network's inputs and produces the outputs. The interfaces use the STL for simplicity and safety, but this can be modified trivially if it is a problem:
void Run(const vector<float>& input, vector<float>& output);
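A call might look like the following sketch, which assumes a Perceptron class exposing Run() and the four-input, one-output topology shown earlier (the class name is an assumption for illustration):

#include <vector>
using std::vector;

// Illustrative only: 'mlp' is assumed to be an instance of the module
// with four inputs and a single output unit.
void Simulate(Perceptron& mlp)
{
    vector<float> input(4, 0.5f);   // four arbitrary test inputs
    vector<float> output;           // filled in by the call

    mlp.Run(input, output);         // output[0] now holds the result
}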
Two kinds of interfaces update the data: incremental training and batch learning. The incremental learning interface consists of two functions. One function deals with individual samples and is defined much like the simulation function, but with a set of default parameters (for instance, a learning rate of 0.1 and a momentum of 0). The other function randomizes the network:
float Sample(const vector<float>& input, const vector<float>& output);
void Randomize();
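Putting the two together, an incremental training loop might look like this sketch, which assumes that Sample() returns the error for that sample and reuses the hypothetical Perceptron class name from above:

#include <vector>
using std::vector;

// Illustrative training loop: start from random weights, then present
// each input/output pair in turn until the total error is small enough.
void TrainIncremental(Perceptron& mlp,
                      const vector< vector<float> >& inputs,
                      const vector< vector<float> >& desired)
{
    mlp.Randomize();

    for (int epoch = 0; epoch < 100; ++epoch)       // arbitrary epoch limit
    {
        float error = 0.0f;
        for (size_t i = 0; i < inputs.size(); ++i)
            error += mlp.Sample(inputs[i], desired[i]);

        if (error < 0.01f)                           // arbitrary threshold
            break;
    }
}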
In contrast, batch algorithms tend to use more memory than incremental learning techniques, so they are kept separate in the implementation; more memory will be used only if batch training is required.
The batch training procedure takes entire arrays of input and output patterns. These are provided along with a default number of iterations and a default error threshold:
float Batch(const vector<Pattern>& inputs, const vector<Pattern>& outputs);
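As an illustration, a batch training run can then be a single call, assuming here that Pattern stores one sample's values and that the default iteration count and error threshold are acceptable:

#include <vector>
using std::vector;

// Illustrative only: one call performs the whole batch run, using the
// module's default iteration count and error threshold. The returned
// error can be checked against an application-specific target.
float TrainBatch(Perceptron& mlp,
                 const vector<Pattern>& inputs,
                 const vector<Pattern>& outputs)
{
    return mlp.Batch(inputs, outputs);
}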
The perceptron can be implemented transparently behind these interfaces.