Neural Network Help

Anyone here good at neural nets? I’ve been studying some of the algorithms behind them, but seeing as I haven’t taken Calculus yet, I’ve had to pick up a lot of the concepts along the way.

Anyway, I generate weights and put them into the net, it produces an output, which I then send to be backpropagated so that the weights can be corrected for maximum correctness.

Unfortunately, all it achieves is making the results worse…

Weights are generated, and 1 and 0 are fed in with an ideal output of 1. This is based on the XOR problem commonly used for testing networks.

Here’s the code:

import java.util.Random;

public class ANN {
	private static final int HIDDEN_WEIGHT = 0;
	private static final int HIDDEN_OLD_WEIGHT = 1;
	private static final int HIDDEN_SUM = 2;
	private static final int HIDDEN_OUT = 3;
	
	private static final int OUTPUT_SUM = 0;
	private static final int OUTPUT_OUT = 1;
	
	float[][] weights;
	float[][] weightHistory;
	
	float[][] hlayers;
	
	int neurons; //Rows
	int hiddenLayers; //Columns
	float idealOutput;
	
	float learningRate = 0.8f;
	float momentum = 0.6f;
	
	public ANN(int neurons, int hiddenLayers, float idealOutput){
		this.neurons = neurons;
		this.hiddenLayers = hiddenLayers;
		this.idealOutput = idealOutput;

		weights = new float[neurons+1][hiddenLayers];
		weightHistory = new float[neurons][hiddenLayers];
		
		hlayers = new float[hiddenLayers+1][4];
		
		//Randomize weights, can be overwritten
		Random r = new Random();
		
		for (int i = 0; i < neurons+1; i++){
			for (int j = 0; j < hiddenLayers; j++){
				weights[i][j] = r.nextFloat();
				hlayers[j][HIDDEN_WEIGHT] = r.nextFloat();
			}
		}
		
		hlayers[hiddenLayers][HIDDEN_WEIGHT] = r.nextFloat();
	}
	
	public void overwriteWeights(float[][] newWeights, float[] hweights){
		weights = newWeights;
		
		for (int a = 0; a < hlayers.length; a++){
			hlayers[a][HIDDEN_WEIGHT] = hweights[a];
		}
	}
	
	public float[] recall(float[] input){
		float[] output = new float[2];
		
		for (int j = 0; j < hiddenLayers; j++){
			float sum = 0f;
			
			//Apply weights to inputs
			for (int i = 0; i < neurons; i++){
				sum += weights[i][j] * input[i];
			}
			
			//Add bias
			sum += weights[neurons][j];
			
			//Send to hidden layer and apply sigmoid
			hlayers[j][HIDDEN_SUM] = sum;
			hlayers[j][HIDDEN_OUT] = sigmoidActivation(sum);
		}
		
		//Apply weights to outputs
		for (int j = 0; j < hiddenLayers; j++){
			output[OUTPUT_SUM] += hlayers[j][HIDDEN_OUT] * hlayers[j][HIDDEN_WEIGHT];
		}
		
		//Add the output-layer bias and apply sigmoid
		output[OUTPUT_SUM] += hlayers[hiddenLayers][HIDDEN_WEIGHT];
		output[OUTPUT_OUT] = sigmoidActivation(output[OUTPUT_SUM]);
		
		System.out.println("SUM: "+output[OUTPUT_SUM] + " OUTPUT: "+output[OUTPUT_OUT]);
		backPropagate(output, input);
		
		return output;
	}
	
	public void backPropagate(float[] output, float[] input){
		float error = getError(output[OUTPUT_OUT], idealOutput);
		float outDelta = getLayerDelta(output[OUTPUT_SUM], error);

		float[] gradient = new float[hiddenLayers];
		
		System.out.println("Error is at "+error);
		
		//Now we back propagate to the output
		for (int j = 0; j < hiddenLayers; j++){
			//float hiddenDelta = (sigmoidDerivative(hlayers[i][HIDDEN_SUM]) * hlayers[i][HIDDEN_WEIGHT]) * outDelta;
			
			gradient[j] = outDelta * hlayers[j][HIDDEN_OUT];
			
			float newWeight = (learningRate * gradient[j]) + (hlayers[j][HIDDEN_WEIGHT] * hlayers[j][HIDDEN_OLD_WEIGHT]);
			
			hlayers[j][HIDDEN_OLD_WEIGHT] = hlayers[j][HIDDEN_WEIGHT];
			hlayers[j][HIDDEN_WEIGHT] = newWeight;
		}  
		
		for (int j = 0; j < hiddenLayers; j++){
			for (int i = 0; i < neurons; i++){
				 float newWeight = (learningRate * gradient[j]) + (weights[i][j] * weightHistory[i][j]);
				 weightHistory[i][j] = weights[i][j];
				 weights[i][j] = newWeight;
			}
		}
	}

	public float getError(float output, float idealOutput){
		return output - idealOutput;
	}
	
	public float getLayerDelta(float sum, float error){
		return -error * sigmoidDerivative(sum);
	}
	
	public float sigmoidActivation(float x){
		return 1f / (float) (1f + Math.exp(-x));
	}
	
	public float sigmoidDerivative(float sum){
		return (sigmoidActivation(sum) * (1f - sigmoidActivation(sum)));
	}
}

Here’s the output when tested:



Commencing test...

Initializing network.

Overwriting weights for control variable.

Recalling [1.0, 0.0]

Test 0
SUM: 1.1265055 OUTPUT: 0.7551935
Error is at -0.24480653
Test 1
SUM: 0.7995221 OUTPUT: 0.68987226
Error is at -0.31012774
Test 2
SUM: 0.8105837 OUTPUT: 0.6922339
Error is at -0.30776608
Test 3
SUM: 0.80390143 OUTPUT: 0.6908084
Error is at -0.30919158
Test 4
SUM: 0.8040238 OUTPUT: 0.6908346
Error is at -0.30916542
Test 5
SUM: 0.803821 OUTPUT: 0.69079125
Error is at -0.30920875
Test 6
SUM: 0.80381775 OUTPUT: 0.69079053
Error is at -0.30920947
Test 7
SUM: 0.8038115 OUTPUT: 0.6907892
Error is at -0.30921078
Test 8
SUM: 0.8038112 OUTPUT: 0.6907891
Error is at -0.3092109
Test 9
SUM: 0.80381095 OUTPUT: 0.6907891
Error is at -0.3092109
Test 10
SUM: 0.80381095 OUTPUT: 0.6907891
Error is at -0.3092109
Test 11
SUM: 0.80381095 OUTPUT: 0.6907891
Error is at -0.3092109
Test 12
SUM: 0.80381095 OUTPUT: 0.6907891
Error is at -0.3092109
Test 13
SUM: 0.80381095 OUTPUT: 0.6907891
Error is at -0.3092109
Test 14
SUM: 0.80381095 OUTPUT: 0.6907891
Error is at -0.3092109
Test 15
SUM: 0.80381095 OUTPUT: 0.6907891
Error is at -0.3092109
Test 16
SUM: 0.80381095 OUTPUT: 0.6907891
Error is at -0.3092109
Test 17
SUM: 0.80381095 OUTPUT: 0.6907891
Error is at -0.3092109
Test 18
SUM: 0.80381095 OUTPUT: 0.6907891
Error is at -0.3092109
Test 19
SUM: 0.80381095 OUTPUT: 0.6907891
Error is at -0.3092109

The error is supposed to be negative in some cases, but I’m not sure whether I’ve done it right.

Please point out flaws; there are bound to be a bunch, given how weak I am mathematically when it comes to calculus.

Also, a few questions:

  • Can Neural Networks only learn one pattern for each network?
  • What’s the point of having more than one set of hidden layer nodes?
  • Where’s a good place to learn some basic Calculus fundamentals?

Thank you very much.

This is one of the subjects I’m studying and fortunately my strongest one. ANN is kind of a general term; what are you planning, backpropagation? Kohonen?

To your questions, IMO:

  1. Yes, it can.
  2. The more hidden layers you have, the better its adaptive/learning ability. For example, if you use it to recognize patterns, it can spot minor details. It can also reduce the error margin between iterations.
  3. College :slight_smile: Nobody wants to read calculus books at home.*

*) Applies to common people, especially non-gamers.

For one, I’m glad to have someone who’s experienced in this, because I need a lot of help! Hahah.

Anyway, I need it to recognize mostly photos, as I’m working on an adaptive AI. Essentially it will be trained to recognize people and things, along with text. Every person/thing will be marked with a good or bad meter so it knows how to react to certain stimuli, and so on. I’ve chosen to give it “eyes” because I plan on welding some parts and making a nifty little robot arm or something for fun over the summer.

But yeah, I need it to recognize places and things. Problem is, I have no real idea how to, and a lot of the examples are written so mathematically that I struggle to comprehend them. I wanted to use backpropagation because it seemed best for letting it learn by itself in some cases.

Huh? I did calculus in my second and third year in high school…

@theagentd
Homework doesn’t count :slight_smile:

Backpropagation is best used for prediction or data mining. For patterns like you said, you may need Hopfield. Actually, rather than making one yourself, there’s already the Neuroph library, which is quite powerful, and you’ll have a working network in less than 20 lines of code :slight_smile:
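
For reference, Neuroph’s classic XOR example looks roughly like this (written from memory, so class/package names may differ slightly between versions):

import org.neuroph.core.data.DataSet;
import org.neuroph.core.data.DataSetRow;
import org.neuroph.nnet.MultiLayerPerceptron;
import org.neuroph.util.TransferFunctionType;

public class XorExample {
	public static void main(String[] args) {
		//XOR training data: two inputs, one output
		DataSet trainingSet = new DataSet(2, 1);
		trainingSet.addRow(new DataSetRow(new double[]{0, 0}, new double[]{0}));
		trainingSet.addRow(new DataSetRow(new double[]{0, 1}, new double[]{1}));
		trainingSet.addRow(new DataSetRow(new double[]{1, 0}, new double[]{1}));
		trainingSet.addRow(new DataSetRow(new double[]{1, 1}, new double[]{0}));
		
		//2 inputs, 3 hidden neurons, 1 output, sigmoid activations
		MultiLayerPerceptron mlp = new MultiLayerPerceptron(TransferFunctionType.SIGMOID, 2, 3, 1);
		mlp.learn(trainingSet); //backpropagation is the default learning rule
		
		//Recall [1, 0], which should come out close to 1
		mlp.setInput(1, 0);
		mlp.calculate();
		System.out.println("1 XOR 0 -> " + mlp.getOutput()[0]);
	}
}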

It’s a lot cooler to do it yourself. I thought Hopfield networks were slow learners?

Yes, Hopfield needs more learning and more training data.

Sorry to reply so late, I’ve been busy.

Anyway, I’ve read in a lot of places that Hopfield is a bad technique because it can’t fix errors. Is that true? Plus, the bigger ones apparently get pretty slow.

I never read about that, but considering that Hopfield uses a matrix as its “memory”, it may be true.

Actually, the same problem applies to neural networks too, but you can do a quick fix by adjusting the number of layers and the neurons in each.

Alright. Thanks. Any specific reads you recommend? Preferably stuff I can get off the internet, because I’m an impatient person and I don’t want to order a book. Hahah

Unfortunately, I got them from books and college lectures :slight_smile:

You wouldn’t happen to be interested in PM’ing me some of your notes, would you? lol

I have no problem PM’ing/copying them; the problem is translating. They’re not written in English ;D

If you could translate the important stuff and send it to me, I’d be thankful. If they’re written in a language with an alphabet relatively similar to English, though, I can translate them myself. Either way, I’d be quite grateful!

I can’t promise anything. Keep searching for another source though ;D

One thing is that it looks like the number of nodes in your hidden layer(s) is the same as in your input layer. Two nodes probably won’t be enough to represent the XOR function; try 3 or more.

Also, I’ve never heard of any value in more than two hidden layers, and I’m pretty sure that theoretically two is sufficient for any mapping from inputs to outputs (though the layers might have to be large in some cases).
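
Another thing that jumps out is the weight update in backPropagate: your “momentum” term multiplies the current weight by the previous weight, whereas the usual rule is new weight = old weight + learning rate * gradient + momentum * last change. Also, the bias weights (weights[neurons][j] and hlayers[hiddenLayers][HIDDEN_WEIGHT]) never get updated, and the input-to-hidden weights are updated with the output gradient rather than their own delta. A rough, untested sketch of what I mean, keeping your field names:

//Usual rule: newWeight = oldWeight + learningRate * gradient + momentum * lastChange
float change = (learningRate * gradient[j])
		+ momentum * (hlayers[j][HIDDEN_WEIGHT] - hlayers[j][HIDDEN_OLD_WEIGHT]);
hlayers[j][HIDDEN_OLD_WEIGHT] = hlayers[j][HIDDEN_WEIGHT];
hlayers[j][HIDDEN_WEIGHT] += change;

//The input-to-hidden weights need their own delta (your commented-out line was close)
//and the input value. Roughly, using HIDDEN_OLD_WEIGHT since the output weight was just updated above:
//float hiddenDelta = sigmoidDerivative(hlayers[j][HIDDEN_SUM]) * hlayers[j][HIDDEN_OLD_WEIGHT] * outDelta;
//weights[i][j] += learningRate * hiddenDelta * input[i]
//		+ momentum * (weights[i][j] - weightHistory[i][j]);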

ANN is a specialized case of non-linear optimization, which is a very tricky area, filled with black art tricks.

Also, there is often a better alternative to NNs for a given problem, if you can figure out good features (there are some really good features for images that you can use), precalculate them, and use as much linear optimization as possible, etc.
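
As a toy illustration of what precalculating a feature can look like (just a grayscale histogram here; real image features are fancier, edge or gradient based for example), something along these lines could be fed to a small or even linear classifier instead of raw pixels:

import java.awt.image.BufferedImage;

public class Features {
	//Reduce an image to a fixed-length feature vector: a normalized 16-bin grayscale histogram
	public static float[] grayHistogram(BufferedImage img) {
		float[] bins = new float[16];
		for (int y = 0; y < img.getHeight(); y++) {
			for (int x = 0; x < img.getWidth(); x++) {
				int rgb = img.getRGB(x, y);
				int gray = (((rgb >> 16) & 0xFF) + ((rgb >> 8) & 0xFF) + (rgb & 0xFF)) / 3; //0..255
				bins[gray / 16]++; //16 bins, each 16 gray levels wide
			}
		}
		int pixels = img.getWidth() * img.getHeight();
		for (int i = 0; i < bins.length; i++) bins[i] /= pixels; //normalize so image size doesn't matter
		return bins;
	}
}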

Here is also a good resource:

Are you doing the Neural Network course at www.coursera.com?