Ver. 1.0
Updated 31-May-2015
Intro video (11 min): https://www.youtube.com/watch?v=yB43jj-wv8Q
- Optimized for 2D image data -- input data can be read from .bmp image files
- Neuron layers can be abstracted as 1D or 2D arrangements of neurons
- Network topology is defined in a text file
- Neurons in layers can be fully or sparsely connected
- Selectable transfer function per layer
- Adjustable or automatic training rate (eta)
- Optional momentum (alpha) and regularization (lambda)
- Convolution filtering and convolution networking
- Standalone console program
- Simple, heavily-commented code, suitable for prototyping, learning, and experimentation
- Optional web-browser-based GUI controller
- Graphic visualizations of individual layers
- No dependencies! Just C++11 (and POSIX networking for the optional webserver interface)
Requirements
Compiling the source
How to run the digits demo
How to run the XOR example
GUI interface
How to use your own data
The 2D in neural2d
Convolution filtering
Convolution networking and pooling
Layer depth
Topology config file format
Topology config file examples
How-do-I X?
- How do I run the command-line program?
- How do I run the GUI interface?
- How do I disable the GUI interface?
- How do I use my own data instead of the digits images?
- How do I use a trained net on new data?
- How do I train on the MNIST handwritten digits data set?
- How do I change the learning rate parameter?
- Are the output neurons binary or floating point?
- How do I use a different transfer function?
- How do I define a convolution filter?
- How do I define convolution networking and pooling?
- How do the color image pixels get converted to floating point for the input layer?
- How can I use .jpg and .png images as inputs to the net?
- Why does the net error rate stay high? Why doesn't my net learn?
- What other parameters do I need to know about?
Requirements
- C++11 compiler (e.g., g++ on Linux)
- POSIX sockets (e.g., Cygwin on Windows), needed only for the optional GUI
- CMake 2.8.12 or later
- Compiles and runs on Linux, Windows, and probably Mac
Compiling the source
We use CMake to configure the build system. First get the source code from the GitHub repository. If you are using the command line, the command is:
git clone https://github.com/davidrmiller/neural2d
That will put the source code tree into a directory named neural2d.
If you are using the CMake graphical interface, run it, set the "source" directory to the neural2d top-level directory, set the binary output directory to a build directory under that, then click Configure and Generate.
If you are using CMake from the command line, cd to the neural2d top level directory, make a build directory, then run cmake from there:
git clone https://github.com/davidrmiller/neural2d
cd neural2d
mkdir build
cd build
cmake ..
make
There is no "install" step. After the neural2d program is compiled, you can execute it or open the project file from the build directory.
On Windows, by default CMake generates a Microsoft Visual Studio project file in the build directory. On Linux and Cygwin, CMake generates a Makefile that you can use to compile neural2d. You can specify a different CMake generator with the -G option, for example:
cmake -G "Visual Studio 11 2012" ..
To get a list of available CMake generators:
cmake --help
If you get errors when compiling the integrated webserver, you can build neural2d without webserver support by running CMake with the -DWEBSERVER=OFF option, like this:
cmake -DWEBSERVER=OFF ..
How to run the digits demo
On systems using Makefiles, in the build directory, execute:
make test
This will do several things: it will compile the neural2d program if necessary; it will expand the archive in images/digits/digits.zip into 5000 individual images; and it will then train the neural net to classify those digit images.
The input data, or "training set," consists of images of numeric digits.
The images are 32x32 pixels each, stored in .bmp format. In this demo, the neural net is configured to have 32x32 input neurons, and 10 output neurons. The net is trained to classify the digits in the images and to indicate the answer by driving the corresponding output neuron to a high level.
Once the net is sufficiently trained, all the connection weights are saved in a file named "weights.txt".
If you are not using Makefiles, you will need to expand the archive in images/digits, then invoke the neural2d program like this:
neural2d ../images/digits/topology.txt ../images/digits/inputData.txt weights.txt
How to run the XOR example
On systems using Makefiles, in the build directory, execute:
make test-xor
For more information about the XOR example, see the project wiki.
GUI interface
First, launch the neural2d console program in a command window with the -p option:
./neural2d topology.txt inputData.txt weights.txt -p
The -p option causes the neural2d program to wait for a command before starting the training.
At this point, the neural2d console program is paused and waiting for a command to continue. Using any web browser, open:
http://localhost:24080
A GUI interface will appear in the browser.
Press Resume to start the neural net training. It will automatically pause when the average error rate falls below a certain threshold (or when you press Pause). You now have a trained net. You can press Save Weights to save the weights for later use.
At the bottom of the GUI window, a drop-down box lists the visualization options available for your network topology. There are options to display the activations (the outputs) of any 2D layer of neurons 3x3 or larger, and convolution kernels of size 3x3 or larger. Visualization images appear at the bottom of the GUI; you can mouse over an image to zoom in.
How to use your own data
If you are inputting data from image files, you'll need to prepare a set of BMP image files and an input data config file. The config file (named inputData.txt by default) is a list of image filenames to use as inputs to the neural net, and optionally the target output values for each image. The format looks like this example:
images/thumbnails/test-918.bmp -1 1 -1 -1 -1 -1 -1 -1 -1 -1
images/thumbnails/test-919.bmp -1 -1 -1 -1 -1 -1 -1 -1 1 -1
images/thumbnails/test-920.bmp -1 -1 -1 -1 -1 -1 1 -1 -1 -1
images/thumbnails/test-921.bmp -1 -1 -1 -1 -1 1 -1 -1 -1 -1
The path and filename cannot contain any spaces.
The path_prefix directive specifies a string that is prepended to all subsequent filenames, until the next path_prefix directive. For example, the previous example could be written:
path_prefix = images/thumbnails/
test-918.bmp
test-919.bmp
test-920.bmp
test-921.bmp
If you are not using image files for input, you'll need to prepare an input config file (named inputData.txt by default) similar to the above but with the literal input values inside curly braces. For example, for a net with eight inputs and two outputs, the format is like this:
{ 0.32 0.98 0.12 0.44 0.98 1.2 1 -1 } -1 1
You'll also need a topology config file (named topology.txt by default). It contains a specification of the neural net topology (the number and arrangement of neurons and connections). Its format is described in a later section. A typical one looks something like this:
input size 32x32
layer1 size 32x32 from input radius 8x8
layer2 size 16x16 from layer1
output size 1x10 from layer2
Then run neural2d (optionally with the web browser interface) and experiment with the parameters until the net is adequately trained, then save the weights in a file for later use.
If you run the web interface, you can change the global parameters from the GUI while the neural2d program is running. If you run the neural2d console program without the GUI interface, there is no way to interact with it while running. Instead, you'll need to examine and modify the parameters in the code at the top of the files neural2d.cpp and neural2d-core.cpp.
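If you do modify parameters in the code, a minimal sketch of the kind of assignments involved looks like this (assuming the Net object exposes members named after the parameters listed above; verify the actual member names in the source):

myNet.eta = 0.1;      // learning rate
myNet.alpha = 0.5;    // momentum; this value is only illustrative
myNet.lambda = 0.0;   // regularization; 0.0 is assumed to disable it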
The 2D in neural2d
In a simple, traditional neural net model, the neurons in each layer are arranged in a single column. In neural2d, you can instead specify a rectangular arrangement of neurons in each layer, such as 32x32.
The neurons can be sparsely connected to mimic how retinal neurons are connected in biological brains. If a radius is specified in the topology config file, each neuron in the right-hand (destination) layer connects to a patch of neurons in the left-hand (source) layer centered on the corresponding position. The pattern projected onto the source layer is elliptical: for example, a radius of 0x2 connects each destination neuron to a column of up to five source neurons (the center plus two above and two below). (Layers configured as convolution filters work slightly differently; see the section on convolution filtering below.)
Convolution filtering
Any layer other than the input layer can be configured as a convolution filter layer by giving it a convolve-matrix specification in the topology config file. The neurons are still called neurons, but their operation differs in the following ways:
- The connection pattern to the source layer is defined by the dimensions of the convolution matrix (kernel), not by a radius parameter.
- The connection weights are initialized from the convolution matrix and remain constant throughout the life of the net.
- The transfer function is automatically set to the identity function; a tf parameter cannot be specified.
For example, the following line in the topology config file defines a 3x3 convolution matrix that sharpens the source layer:
layerConv1 size 64x64 from input convolve {{0,-1,0},{-1,5,-1},{0,-1,0}}
When a convolution matrix is specified for a layer, you cannot also specify a radius parameter, because the convolution matrix dimensions determine the size and shape of the patch of source neurons each destination neuron connects to. You also cannot specify a tf parameter, because the transfer function on a convolution layer is automatically the identity function.
The elements of the convolution matrix are stored as connection weights to the source neurons. Connection weights on convolution layers are not updated by the back propagation algorithm, so they remain constant for the life of the net.
The results are undefined if a layer is defined as both a convolution layer and a regular layer.
For illustrations of various convolution kernels, see the Wikipedia article on image-processing kernels: https://en.wikipedia.org/wiki/Kernel_(image_processing)
In the following example, a convolution filter with a 2x2 kernel is applied to the input layer, and its output is combined with a reduced-resolution, fully connected pathway:
input size 8x8
layerConvolve size 8x8 from input convolve {{-1,2},{-1,2}}
layerReducedRes size 4x4 from input
output size 2 from layerConvolve
output size 2 from layerReducedRes
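Because layerReducedRes specifies no radius parameter, each of its 4x4 neurons connects to every neuron of the 8x8 input layer (full connectivity is assumed to be the default when no radius is given), while each layerConvolve neuron connects only to the 2x2 patch of input neurons covered by its kernel.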
Convolution networking and pooling
A convolution network layer is like a convolution filter layer, except that the kernel participates in backprop training, and everything inside the layer is replicated N times to train N separate kernels. A convolution network layer is said to have depth N, and contains depth * X * Y neurons.
Any layer other than the input or output layer can be configured as a convolution networking layer by specifying a layer depth > 1, and specifying the kernel size with a convolve parameter. For example, to train 40 kernels of size 7x7 on an input image of 64x64 pixels:
input size 64x64
layerConv size 40*64x64 from input convolve 7x7
. . .
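In this example, layerConv contains 40 * 64 * 64 = 163,840 neurons: one 64x64 plane for each of the 40 kernels being trained.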
A pooling layer down-samples the previous layer by finding the average or maximum within patches of source neurons. A pooling layer is defined in the topology config file by specifying a pool parameter on a layer.
In the topology config syntax, the pool parameter requires the argument "avg" or "max" followed by the operator size. For example, in a convolution network pipeline of depth 20, you might have these layers:
input size 64x64
layerConv size 20*64x64 from input convolve 5x5
layerPool size 20*16x16 from layerConv pool max 4x4
. . .
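Here, each neuron in layerPool takes the maximum over a 4x4 patch of the corresponding 64x64 plane in layerConv, down-sampling each plane to 16x16.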
Layer depth
All layers have a depth, whether explicit or implicit. Layer depth is specified in the topology config file in the layer size parameter, as an integer and an asterisk before the layer size. If the depth is not specified, it defaults to one. For example:
- size 10*64x64 means 64x64 neurons, depth 10
- size 64x64 means 64x64 neurons, depth 1
- size 1*64x64 also means 64x64 neurons, depth 1
- size 10*64 means 64x1 neurons, depth 10 (the Y dimension defaults to 1)
The primary purpose of layer depth is to allow convolution network layers to train multiple kernels. However, the concept of layer depth is generalized in neural2d, allowing any layer to have any depth and connect to any other layer of any kind with any depth.
The way two layers are connected depends on the relationship of the source and destination layer depths as shown below:
Relationship           | How connected
-----------------------|------------------------------------------------------
src depth == dst depth | connects only to the corresponding depth in the source
src depth != dst depth | fully connected across all depths
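For example, annotating the pooling pipeline shown earlier (the output layer here is hypothetical, added to show the unequal-depth case):

input size 64x64                                     <= depth 1
layerConv size 20*64x64 from input convolve 5x5      <= depths differ (1 and 20)
layerPool size 20*16x16 from layerConv pool max 4x4  <= equal depths: each slice connects to the corresponding slice
output size 10 from layerPool                        <= depths differ (20 and 1): fully connected across all depths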
Topology config file format
Here is the grammar of the topology config file:
layer-definition := layer-name parameters
parameters := parameter [ parameters ]
parameter :=
    input | output | layername
    size dxy-spec
    from layer-name
    channel channel-spec
    radius xy-spec
    tf transfer-function-spec
    convolve filter-spec
    convolve xy-spec
    pool { max | avg } xy-spec
dxy-spec := [ integer * ] integer [ x integer ]
xy-spec := integer [ x integer ]
channel-spec := R | G | B | BW
transfer-function-spec := tanh | logistic | linear | ramp | gaussian | relu
filter-spec := same {{,},{,}} syntax used for array initialization in C, C#, VB, Java, etc.
Rules:
- Comment lines that begin with "#" and blank lines are ignored.
- The first layer defined must be named "input".
- The last layer defined must be named "output".
- The hidden layers can be named anything beginning with "layer".
- The argument for "from" must be a layer already defined.
- The color channel parameter can be specified only on the input layer.
- If a size parameter is omitted, the size is copied from the layer specified in the from parameter.
- A radius parameter cannot be used on the same line with a convolve or pool parameter.
- The same layer name can be defined multiple times with different "from" parameters. This allows source neurons from more than one layer to be combined in one destination layer. The source layers can be any size, but the repeated (destination) layer must have the same size in each specification. For example, in the following, layerCombined is size 16x16 and takes inputs from two source layers of different sizes:
input size 128x128
layerVertical size 32x32 from input radius 1x8
layerHorizontal size 16x16 from input radius 8x1
layerCombined from layerHorizontal <= assumes size 16x16 from the source layer
layerCombined size 16x16 from layerVertical <= repeated destination, must match 16x16
output size 1 from layerCombined
- In the xy-spec, and in the X,Y part of the dxy-spec, you may specify one or two dimensions. Spaces are not allowed in the size spec. If only one dimension is given, the other is assumed to be 1. For example:
- "8x8" means 64 neurons in an 8x8 arrangement.
- "8x1" means a row of 8 neurons.
- "1x8" means a column of 8 neurons.
- "8" means the same as "8x1".
Topology config file examples
Here are a few complete topology config files and the nets they specify.
input size 4x4
layer1 size 3x3 from input
layer2 size 2x2 from layer1
output size 1 from layer2
input size 4x4
layer1 size 1x4 from input
layer2 size 3x1 from layer1
output size 1 from layer2
input size 4x4
output size 4x1 from input radius 0x2
input size 16x16
layer1 size 4x4 from input radius 1x1
output size 7x1 from layer1
# In this example, layerVertical and layerHorizontal are each a
# 2x2 set of 4 neurons; one path extracts vertical features and
# the other extracts horizontal features.
input size 6x6
layerHorizontal size 2x2 from input radius 2x0
layerVertical size 2x2 from input radius 0x2
output size 1 from layerHorizontal
output size 1 from layerVertical
# This example shows how vertical and horizontal image features can be
# extracted through separate paths and combined in a subsequent layer.
input size 4x4
layerH1 size 1x4 from input radius 4x0
layerH2 size 1x4 from layerH1
layerH3 size 1x4 from layerH2
layerV1 size 4x1 from input radius 0x4
layerV2 size 4x1 from layerV1
layerV3 size 4x1 from layerV2
output size 2 from layerV3
output size 2 from layerH3
How-do-I X?
How do I run the command-line program?
Run neural2d with three arguments specifying the topology configuration, input data configuration, and where to store the weights if training succeeds:
./neural2d topology.txt inputData.txt weights.txt
How do I run the GUI interface?
First launch the neural2d program with the -p option:
./neural2d topology.txt inputData.txt weights.txt -p
Then open a web browser and point it at http://localhost:24080.
If your firewall complains, you may need to allow access to TCP port 24080.
How do I disable the GUI interface?
Run CMake with the -DWEBSERVER=OFF option. Or if you are using your own home-grown Makefiles, you can define the preprocessor macro DISABLE_WEBSERVER. For example, with gnu compilers, add -DDISABLE_WEBSERVER to the g++ command line. Alternatively, you can undefine the macro ENABLE_WEBSERVER in neural2d.h.
When the web server is disabled, there is no remaining dependency on POSIX sockets.
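For example, a minimal sketch of such a compile command (the exact source file list is an assumption based on the file names mentioned in this document):

g++ -std=c++11 -DDISABLE_WEBSERVER neural2d.cpp neural2d-core.cpp -o neural2d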
How do I use my own data instead of the digits images?
Create your own directory of BMP images, and a config file that follows the same format as the provided default inputData.txt. Then define a topology config file with the appropriate number of network inputs and outputs, and run the neural2d program.
Or if you don't want to use image files for input, make an input config file containing all the literal input values and the target output values. The format is described in an earlier section.
How do I use a trained net on new data?
It's all about the weights file. After the net has been successfully trained, save the internal connection weights in a weights file. That's typically done in neural2d.cpp by calling the member function saveWeights(filename).
The weights you saved can be loaded back into a neural net of the same topology using the member function loadWeights(filename). Once the net has been loaded with weights, it can be applied to new data by calling feedForward(). Prior to calling feedForward(), you'll want to set a couple of parameters:
myNet.repeatInputSamples = false;
myNet.reportEveryNth = 1;
This is normally done in neural2d.cpp.
You'll need to prepare a new input data config file (default name inputData.txt) that contains a list of only those new input data images that you want the net to process.
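Putting it all together, a hedged sketch of the recall flow (member names are taken from this document; check neural2d.cpp for the exact calls and for how the input samples are iterated):

myNet.loadWeights("weights.txt");   // restore the trained connection weights
myNet.repeatInputSamples = false;   // make one pass over the new samples
myNet.reportEveryNth = 1;           // report the outputs for every sample
myNet.feedForward();                // propagate the new inputs through the net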
How do I train on the MNIST handwritten digits data set?
See the instructions in the wiki.
How do I change the learning rate parameter?
In the command-line program, you can change the eta parameter by directly setting the eta member of the Net object, like this:
myNet.eta = 0.1;
When using the web interface, you can change the eta parameter (and other parameters) in the GUI at any time, even while the network is busy processing input data.
Also see the Parameter List in the wiki.
Are the output neurons binary or floating point?
They are interpreted in whatever manner you train them to be, but you can only train the outputs to take values in the range that the transfer function is capable of producing.
If you're training a net to output binary values, it's best to use the extremes of the transfer function's output range to represent the two binary values. For example, when using the default tanh() transfer function, train the outputs to be -1 and +1 for false and true. When using the logistic transfer function, train the outputs to be 0 and 1.
How do I use a different transfer function?
You can add a "tf" parameter to any layer definition line in the topology config file. The argument to tf can be "tanh", "logistic", "linear", "ramp", "gaussian", or "relu". The transfer function you specify will be used by all the neurons in that layer. See neural2d-core.cpp for more information.
In the topology config file, the tf parameter is specified as in this example:
layerHidden1 size 64x64 from input radius 3x3 tf linear
You can add new transfer functions by following the examples in neural2d-core.cpp. There are two places to change: first find where transferFunctionTanh() is defined and add your new transfer function and its derivative there. Next, locate the constructor for class Neuron and add a new else-if clause there, following the examples.
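For instance, here is a minimal, hypothetical sketch of the two additions (the names transferFunctionSoftplus and tfName are invented for illustration; match the actual signatures and variable names you find in neural2d-core.cpp):

#include <cmath>

// 1. Alongside transferFunctionTanh(), define the new function and its derivative:
double transferFunctionSoftplus(double x)      { return std::log(1.0 + std::exp(x)); }
double transferFunctionSoftplusDeriv(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// 2. In the Neuron constructor, extend the else-if chain that selects the
//    transfer function by name, for example:
//        else if (tfName == "softplus") {
//            transferFunction = transferFunctionSoftplus;
//            transferFunctionDerivative = transferFunctionSoftplusDeriv;
//        }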
How do I define a convolution filter?
In the topology config file, any layer defined with a convolve parameter and a list of constant weights will operate as a convolution filter applied to the source layer. The syntax is of the form:
layer2 size 64x64 from input convolve {{1,0,-1},{0,0,0},{-1,0,1}}
How do I define convolution networking and pooling?
In the topology config file, define a layer with an X,Y size and a depth (number of kernels to train), and add a convolve parameter to specify the kernel size. For example, to train 40 kernels of size 7x7 on an input image of 64x64 pixels:
input size 64x64
layerConv size 40*64x64 from input convolve 7x7
. . .
To define a pooling layer, add a pool parameter, followed by the argument "avg" or "max," followed by the operator size, e.g.:
layerConv size 10*32x32 ...
layerPool size 10*8x8 from layerConv pool max 4x4
. . .
How do the color image pixels get converted to floating point for the input layer?
That's in the ReadBMP() function in neural2d-core.cpp. The default version of ReadBMP() converts each RGB pixel to a single floating point value in the range 0.0 to 1.0.
By default, the RGB color pixels are converted to monochrome and normalized to the range 0.0 to 1.0. That can be changed at runtime by setting the colorChannel member of the Net object to R, G, B, or BW prior to calling feedForward(). E.g., to use only the green color channel of the images, use:
myNet.colorChannel = NNet::G;
The color conversion can also be specified in the topology config file on the line that defines the input layer by setting the "channel" parameter to R, G, B, or BW, e.g.:
input size 64x64 channel G
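As a rough sketch of what such a conversion looks like (the luminance weights below are the common Rec. 601 coefficients, an assumption; see ReadBMP() in neural2d-core.cpp for the authoritative code):

// Convert one 8-bit RGB pixel to a single input value in the range 0.0 to 1.0:
float pixelToInput(unsigned char r, unsigned char g, unsigned char b)
{
    return (0.30f * r + 0.59f * g + 0.11f * b) / 255.0f;  // weighted luminance, normalized
}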
How can I use .jpg and .png images as inputs to the net?
Currently only .bmp images are supported. This is because the uncompressed BMP format is so simple that we can use simple, standard C/C++ to read the image data without any dependencies on third-party image libraries. To add an adapter for other image formats, follow the example of the ReadBMP() function and write a new adapter such as ReadJPG(), ReadPNG(), etc., using your favorite image library, then replace the call to ReadBMP() with your new function.
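For example, here is a hypothetical ReadPNG() adapter sketched with the single-header stb_image library (https://github.com/nothings/stb); the return type and error handling are assumptions to be reconciled with the real ReadBMP() in neural2d-core.cpp:

#define STB_IMAGE_IMPLEMENTATION
#include "stb_image.h"
#include <stdexcept>
#include <string>
#include <vector>

// Read a PNG file and return one monochrome value in the range 0.0 to 1.0 per pixel:
std::vector<float> ReadPNG(const std::string &filename)
{
    int w, h, comp;
    unsigned char *pixels = stbi_load(filename.c_str(), &w, &h, &comp, 3);
    if (pixels == nullptr) {
        throw std::runtime_error("Cannot read image file " + filename);
    }
    std::vector<float> data;
    data.reserve((size_t)w * h);
    for (int i = 0; i < w * h; ++i) {
        data.push_back((0.30f * pixels[3*i] + 0.59f * pixels[3*i + 1] + 0.11f * pixels[3*i + 2]) / 255.0f);
    }
    stbi_image_free(pixels);
    return data;
}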
Why does the net error rate stay high? Why doesn't my net learn?
Neural nets are finicky. Try different network topologies. Try starting with a larger eta value and reducing it incrementally. The problem could also be caused by redundancy in the input data, mislabeled target output values, or too few training samples.
What other parameters do I need to know about?
Check out the list of parameters in the wiki.
The neural2d program and its documentation are copyrighted and licensed under the terms of the MIT license.
The set of digits images in the images/digits/ subdirectory is released to the public domain.