Backpropagation algorithm

When using the backpropagation algorithm for training a neural network, what are good values for learning rate and momentum? Thanks in advance.

-Rob

Impossible to say without knowing more about your network. Other factors such as the number of neurons in each layer are equally important and interdependent with the learning rate.

I suggest setting up an analysis tool to see how the network performs for a range of parameter values, and determining the optimal learning rate experimentally.

In the simplest case, the ‘analysis tool’ is just a for() loop where you increase the learning rate at each step and log whether, and how quickly, the network converges.
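For illustration, a bare-bones version of that loop might look like the sketch below. The single-weight “network” learning y = 2x is just a hypothetical stand-in for a real net, so you can see the sweep-and-log idea end to end:

```java
/**
 * Minimal sketch of the 'analysis tool' idea: sweep the learning rate
 * and log whether (and how fast) training converges on a toy problem.
 * The single weight learning y = 2x stands in for a real network.
 */
public class LearningRateSweep {
    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0, 4.0};
        double[] y = {2.0, 4.0, 6.0, 8.0};

        for (double rate = 0.01; rate <= 0.2; rate += 0.01) {
            double w = 0.0;               // initial weight
            double error = Double.MAX_VALUE;
            int epochs = 0;

            // Train until the error is tiny or we give up.
            while (error > 1e-6 && epochs < 10000) {
                double grad = 0.0;
                error = 0.0;
                for (int i = 0; i < x.length; i++) {
                    double diff = w * x[i] - y[i];
                    grad += diff * x[i];  // batch gradient
                    error += diff * diff; // summed squared error
                }
                w -= rate * grad;         // gradient-descent step
                epochs++;
            }
            System.out.printf("rate=%.2f  epochs=%d  error=%.2e%n",
                              rate, epochs, error);
        }
    }
}
```

Even on this toy problem the log shows the pattern discussed below: small rates converge slowly, and past a certain point larger rates stop converging at all.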

I would really need a full backprop algorithm implemented in Java. I hope somebody can help me… thanks
my mailadress: bodoelod@yahoo.com

It’s an absolute nightmare trying to train a NN. Really it needs to be done by hand, and you need to monitor the weight changes by eye. You can then bump the network when it gets stuck in a local minimum. Luckily there is a pretty good tool you can do this with: JOONE.

You can think of a standard feed-forward, multi-layer neural net as simply a function approximator.

A high ‘learning rate’ means that your NN converges faster. The tradeoff is a greater chance of settling at sub-optimal values in a local minimum.

A high ‘momentum’ means that your NN has a greater tendency to “climb” out of a minimum, and therefore a smaller chance of converging on a sub-optimal local minimum. The tradeoff is that it can also get pushed out of the global minimum (which is where you want the solution to converge).
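Concretely, momentum just carries over a fraction of the previous weight change. A minimal sketch of the standard textbook update rule, assuming you already have the gradients from backprop:

```java
/** Sketch of the standard backprop weight update with momentum. */
public class MomentumUpdate {
    private final double learningRate;
    private final double momentum;
    private final double[] previousDelta; // last step's change per weight

    public MomentumUpdate(int numWeights, double learningRate, double momentum) {
        this.learningRate = learningRate;
        this.momentum = momentum;
        this.previousDelta = new double[numWeights];
    }

    /** Applies delta_w(t) = -rate * gradient + momentum * delta_w(t-1). */
    public void update(double[] weights, double[] gradients) {
        for (int i = 0; i < weights.length; i++) {
            double delta = -learningRate * gradients[i]
                         + momentum * previousDelta[i];
            weights[i] += delta;          // take the step
            previousDelta[i] = delta;     // remember it for next time
        }
    }
}
```

The momentum term keeps the weights moving in the direction of the previous step, which is what lets them coast through small dips in the error surface.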

The tricky thing is to find magic values for both the ‘learning rate’ and ‘momentum’ that give you the best results for your NN - that usually requires a lot of experimentation and tweaking. I would suggest starting with a small ‘learning rate’ and a high ‘momentum’.

When it comes to neural networks, there are no definite answers; empirical studies and techniques are frequently used. Regarding your question about momentum and learning rate, the situation is as described in the previous answers. I would suggest you start with low values for both (in a typical feedforward MLP, 0.3 for the learning rate and 0.2 for momentum are ‘nice’ values to start with). Monitor your network and estimate its performance on a separate testing set (kept hidden from training) in order to see which values work best for your case, and tune accordingly.

Moreover, there are a few other things you can do to combat the local minima problem:

Try using the stochastic approximation to gradient descent, which updates the weights after each individual training example. It effectively traverses a distinct error surface for each example, and the averaging effect of those per-example steps helps the search converge while slipping past local minima.
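A sketch of what that means in code, reusing the toy single-weight problem from above: instead of summing the gradient over the whole training set before each step, you update immediately after every example:

```java
/**
 * Sketch of stochastic (per-example) gradient descent on the toy
 * problem y = 2x. Each example defines its own error surface, and
 * stepping example by example approximates the batch gradient.
 */
public class StochasticSketch {
    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0, 4.0};
        double[] y = {2.0, 4.0, 6.0, 8.0};
        double w = 0.0;
        double rate = 0.05;

        for (int epoch = 0; epoch < 100; epoch++) {
            for (int i = 0; i < x.length; i++) {
                double diff = w * x[i] - y[i];
                w -= rate * diff * x[i];  // update per example, not per batch
            }
        }
        System.out.println("learned weight: " + w); // close to 2.0
    }
}
```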

Train multiple neural networks with different parameters (learning rate, momentum, initial weights) on the same training set. Evaluate them and choose the best-performing one.

Use a separate validation set in order to stop training in time and avoid overfitting.
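For that last point, a minimal early-stopping loop might look like the sketch below; the Trainable interface is a hypothetical stand-in for whatever your network implementation provides:

```java
/** Hypothetical interface for a trainable network. */
interface Trainable {
    void trainOneEpoch();             // one pass over the training set
    double validationError();         // error on the held-out validation set
    double[] snapshotWeights();       // copy of the current weights
    void restoreWeights(double[] w);  // roll back to a saved copy
}

public class EarlyStopping {
    /** Trains until validation error stops improving for 'patience' epochs. */
    public static void train(Trainable net, int maxEpochs, int patience) {
        double bestError = Double.MAX_VALUE;
        double[] bestWeights = net.snapshotWeights();
        int epochsSinceImprovement = 0;

        for (int epoch = 0; epoch < maxEpochs; epoch++) {
            net.trainOneEpoch();
            double error = net.validationError();
            if (error < bestError) {
                bestError = error;
                bestWeights = net.snapshotWeights();
                epochsSinceImprovement = 0;
            } else if (++epochsSinceImprovement >= patience) {
                break; // validation error stopped improving: likely overfitting
            }
        }
        net.restoreWeights(bestWeights); // keep the best-generalizing weights
    }
}
```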

Try out http://www-ra.informatik.uni-tuebingen.de/SNNS/ . It’s a really cool tool for setting up and understanding NNs.