Java Brute Forcing Algorithm Optimization.

This post is mainly relative to Algorithms and Optimization of Java.
Please don’t comment with negative criticizing statements regarding the title, this is for my own education XD.
There’s so much more than my full implementation to even make this thing usable.

To begin, I got bored and decided to make one ::slight_smile:
I’ve been working on it for about 2 days so far.
Not to brag but rather to give some details, it can currently handle up to 11 slots. (Numbers, Uppercase/Lowercase mixes)

Version 0.4: http://i47.tinypic.com/16gao88.png
Version 0.5:

At the moment the algorithm is like so:

Before we begin:
[Pseudo Code Definition]

[A Quote I heard]

Pseudo Code:


   /*
	 * Long method >...<
	 */
	private boolean brute_force_for(TargetPassword target, int target_slots, String target_query_types[]) {
		String password = (String) targetPassword.targetObject;
		switch (target_slots) {
		/* 1 SLOT FOR PASSWORD */
		case 1:
			for (a = 0; a < target_query_types.length; a++) {
				COMPUTATED_DATA[a] = appendLetter(a);
				if (password.equals(COMPUTATED_DATA[a])) {
					return pass_cracked(password);
				}
				debug(COMPUTATED_DATA[a]);
			}
			break;
		/* 2 SLOTS FOR PASSWORD */
		case 2:
			for (a = 0; a < target_query_types.length; a++)
				for (ab = 0; ab < target_query_types.length; ab++) {
					COMPUTATED_DATA[a] = (new StringBuilder(String.valueOf(appendLetter(a)))).append(appendLetter(ab)).toString();
					if (password.equals(COMPUTATED_DATA[a])) {
						return pass_cracked(password);
					}
					debug(COMPUTATED_DATA[a]);
				}
			break;
		}
	}

Code Notes:
The for loops will continue for a total of 11 switch statements, i removed these to keep the code short.
But as you can imagine it gets lengthy and bulky while processing hundreds of thousands / millions of combinations per second.

What my main questions are:
How can I optimize the method seen above.
Should I use StringBuilders or not?
Can I wrap my for loops into a more optimized / less repetitive way?
I’m using String[] arrays for my allowed characters constants, is this efficient?
I’m using a Switch() statement, is this efficient?
Is anything you see unnecessary / not where it should be?
Have one of these been built ‘Virus Free’ using Java?
Is it possible for this to be processed on a GPU for faster computations?

Closing Statement:
Hope somebody can help me, thanks.
If a staff member finds this post ‘malicious’ please note there’s so much more needed for this code to work due to it’s state of being ‘pseudo’.
Thanks guys, looking forward to any feedback.

SEE here: http://www.java-gaming.org/topics/orders-of-magnitude/27917/view.html

Thanks Roquen.
Totally bizarre, some passwords can take up to 1.4+ Million years to brute force LOL.

If you’re good with optimization or you just want to see a rather long switch statement. (Basis of this post)
Seeing this should give you a idea for the type of optimizations I need ;D

Here it is:
EDIT: Updated the code, changed it from the ‘if, else’ statement to the ‘switch’ statement (Still the same algorithm).
[Didn’t want to spam the opening post]


		/* 11 SLOTS FOR PASSWORD */
		case 11:
			for (a = 0; a < target_query_types.length; a++)
				for (ab = 0; ab < target_query_types.length; ab++)
					for (abc = 0; abc < target_query_types.length; abc++)
						for (abcd = 0; abcd < target_query_types.length; abcd++)
							for (abcde = 0; abcde < target_query_types.length; abcde++)
								for (abcdef = 0; abcdef < target_query_types.length; abcdef++)
									for (abcdefg = 0; abcdefg < target_query_types.length; abcdefg++)
										for (abcdefgh = 0; abcdefgh < target_query_types.length; abcdefgh++)
											for (abcdefghi = 0; abcdefghi < target_query_types.length; abcdefghi++)
												for (abcdefghijk = 0; abcdefghijk < target_query_types.length; abcdefghijk++) {
													// TODO: Benchmark Between:
													// (var += "ab") VS. ((StringBuilder)var.append("ab"))
													COMPUTATED_DATA[a] = (new StringBuilder(
															String.valueOf(appendLetter(a))))
															.append(appendLetter(ab))
															.append(appendLetter(abc))
															.append(appendLetter(abcd))
															.append(appendLetter(abcde))
															.append(appendLetter(abcdef))
															.append(appendLetter(abcdefg))
															.append(appendLetter(abcdefgh))
															.append(appendLetter(abcdefghi))
															.append(appendLetter(abcdefghij))
															.append(appendLetter(abcdefghijk))
															.toString();
													if (password
															.equals(COMPUTATED_DATA[a])) {
														return pass_cracked(password);
													}
													debug(COMPUTATED_DATA[a]);
												}
			break;

34 Lines later
http://www.420magazine.com/forums/images/smilies/smokin2.gif

Hi there.

I have made a fully recursive version of this in C.
http://pastebin.java-gaming.org/652ff3f5941

It shouldn’t be to hard to translate to Java. (Thats why I did it for you :D)
Java:
http://pastebin.java-gaming.org/52fff495143

EDIT:
Actully doing some tests i found out that Java version is faster on longer passwords(strings).
While C is faster at smaller strings like 1-3 chars…

for example in one test:
In C version, “breakm”, took ~320seconds,
while in Java version it took ~69seconds.

Damn, right on mate.
Thanks for the feedback, I’ll look into the links you provided XD
You already translated it, right on.

Today I’ll be doing major benchmarking on my version of the brute forcer :slight_smile:
I’ll benchmark yours VS mine also to see if recursion is the transformation I need.

Yay, I guess I’m doing ok!
Brute forced the work ‘breakm’ in 6 seconds :point:
(With the algorithm seen in the opening post)

IGNORE THIS: Because I changed my {code}{/code} to {quote}{/quote}, I have to put this text here to compensate for the ‘over 90% of my text = a quote’ denial.

DISREGARD THE ERROR RELATING TO COMPUTATION CALCULATION BELOW [Calculations are correct]

My total computations calculation seems to be a little bit off. (13 Million off sometimes)
I was wondering…
How would I go about calculating the total possible computations / combinations?
Each slot can be up to 26 different letters so here’s may attempt:


protected int getEstimatedComputations(final int targets_total_slots) {
	// 26 is = to length of the alphabet[] array.
	// a = 1 slot = 26 computations.
	// aa = 2 slots = (26 * 26) computations.
	// aaa = 3 slots = (26 * 26 * 26) computations.
	int computations = 1;
	// for each slot, the computations is *= 26 as seen above.
	for (int i = 0; i < targets_total_slots; i++) {
		computations *= 26;
	}
	return computations;
}

Does that look correct?

I would expect upper and lower case characters as well as underscores, exclamation marks and so on, so depending on password rules, so the combinations are 60 raised to the power of the password length - more in fact as the passwords typicaly can be different lengths.

DISREGARD THE ERROR RELATING TO COMPUTATION CALCULATION BELOW [Calculations are correct]

Well, I’ve got multiple arrays for that type of stuff XD.
(Sorry to turn this into a code snippet thread)

I’m currently trying to achieve calculating the total computations/combinations for the 26 letter (lowercase alphabet) array[] first.
After I can calculate that properly I’ll create / adapt methods to calculate the total based off of the target_array[]s length. (See code below)

Here’s what I have at the moment, hoping these arrays will prove useful for someone who’s not wanting to hand write them.
I googled to find my first premade ‘Alphabet array’ than made the other arrays based off of it.


	public static final String[] TARGET_ALL_NUMBERS = { "0", "1", "2", "3", "4", "5", "6", "7", "8", "9" };
	public static final String[] TARGET_ALL_LC_LETTERS = { "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z" };
	public static final String[] TARGET_ALL_UC_LETTERS = { "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z" };
	public static final String[] TARGET_ALL_LC_LETTERS_AND_NUMBERS = { "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z" };
	public static final String[] TARGET_ALL_UC_LETTERS_AND_NUMBERS = { "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z" };
	public static final String[] TARGET_ALL_UPPER_AND_LOWER_CASE_LETTERS = { "A", "a", "B", "b", "C", "c", "D", "d", "E", "e", "F", "f", "G", "g", "H", "h", "I", "i", "J", "j", "K", "k", "L", "l", "M", "m", "N", "n", "O", "o", "P", "p", "Q", "q", "R", "r", "S", "s", "T", "t", "U", "u", "V", "v", "W", "w", "X", "x", "Y", "y", "Z", "z" };
	public static final String[] TARGET_ALL_UC_LC_AND_NUMBERS = { "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "A", "a", "B", "b", "C", "c", "D", "d", "E", "e", "F", "f", "G", "g", "H", "h", "I", "i", "J", "j", "K", "k", "L", "l", "M", "m", "N", "n", "O", "o", "P", "p", "Q", "q", "R", "r", "S", "s", "T", "t", "U", "u", "V", "v", "W", "w", "X", "x", "Y", "y", "Z", "z" };


But, I meant as far as my computations calculation is it correct?

It doesn’t seem to be correct according to: calc.opensecurityresearch.com
It’s saying a 6 letter password only using lowercase letters could take 321,272,406 computations.
Where as my estimated total (calculate with the method seen 2 posts above) said: 308,915,776.

for each slot(computations *= target_array.length);

I’m not sure if this is helpful or not to you but here are the 10000 most common passwords beside a frequency of how often they are used. http://pastebin.com/F2yhB4vv

That’s great man thanks, I only had commons of 2012 XD
I need to figure out how to literally ‘Load A Dictionary’ efficiently.
Don’t want to wait 30+ minutes to read the .txt file >…<

Off topic to the post but, any tips for how I should handle my dictionary?
Dictionary.txt file composed of one word per line.

Should I shoot for file shrinking?
Definitely be upgrading to a byte_buffer.
Any feedback would be great thanks.

EDIT:
Looks like I’ll be adding a ‘common frequency’ slider/meter for loading common passwords XD
I thought it said 1,000 commons! 10,000 damn [appreciated].
And it’s ordered by high to low frequency… Me likey LOL

What on Earth are you actually trying to do?

You build a string, store it in an array for no obvious reason, and then compare it to another string. What does this accomplish?

Normally when people talk about brute forcing a password, what they’re actually doing is enumerating passwords by brute force, hashing them, and comparing the hash with a target value. The most efficient way of approaching it then depends on the structure of the hash, although if it has a block length of more than the passwords you’re processing then there’s unlikely to be much benefit in caching intermediate results. If there’s no benefit in caching results and you’re not using a statistical model of most likely passwords then the easiest way to enumerate them is to count in a suitable base.

Hey thanks, there’s no need to store the computation if it isn’t going to return true with the computation needed, discard it and keep moving ;D ;D

I don’t know if you’ve programmed one of these before lol.
If you haven’t programmed one without recursion (I have no idea how I thought about programming this, just figured id go aaa, aab, aac and hey it worked!) you’ll find yourself doing it like I’m doing at first. (No salt / no hash)
Enter a password, text your algorithm.

If I wanted to implement hashing and salting etc I could, I’m actually deciding to do something with this project so It’s version 0.4 at the moment.

Developing Version 0.5 right now:
http://oi45.tinypic.com/2e68twk.jpg


(Major performance fixes, using custom swing workers and cache mapping, major database updates (lists etc))
What on earth am I trying to do
http://www.420magazine.com/forums/images/smilies/smokin2.gif
?

lol, my version turned out around 23 times faster than the guy who posted his version of it in java with recursion.
Everywhere I read, recursions usually the first and best option.
I’m sorry, half the time when I’m writing code I don’t understand what I’m writing, there’s a complex chunk of machine code that needs to be written XD and I actually figured this one out so… I’m happy.

Maybe the code’s just not comprehend-able due to the unnecessary object creation / checking etc.
Sorry for the long reply :stuck_out_tongue:

Here’s the output if you want to see it:
(Brute forced the word ‘test’)
http://filebin.ca/ZM7u3fGQzZA/dump.txt (1.92MB Pure text)

Recursion is easy, but it’s very rarely “fast” except in some very narrow cases using a functional language which directly supports transformations of recursion (transparently and under-the-hood).

I’ve written enumerations of various combinatorial structures. One useful question to ask yourself is: can I define a canonical order such that there’s a straightforward next() method? In this case, the canonical order is fairly straightforward, so the question is: given "prabkqdtxczz" how would I create the next string, "prabkqdtxdaa"?

I really don’t understand what your goal is with this, but i have some tips. If you really mean the “Brute” in the “Brute force”, then stop using the StringBuffer, and switch to raw byte[]-s instead. Also stop creating a “new” object in the middle of the loop. You could also unroll those loops to a single loop, and increase the indices in the byte[] yourself.

Ahh never mind, this makes no sense. You are probably just trying to make some benchmark that is heavy on loops and memory.

Well, I ran across a C# implementation of brute forcing with ‘recursion’.
Ran some benchmarking before I went to bed and the results came back: recursion IS faster than no recursion.

Console output between recursion method and no recursion:

Recursion C# implementation translated to Java:
(Word ‘zoosk’ takes 1,420ms to 2,000ms)


protected final void recursion(String pwd, int pos) {
	if (pos < passLength) {
		for (char ch : charTable) recursion(pwd + ch, pos + 1);
	}
}

No Recursion C# implementation translated to Java:
(Word ‘zoosk’ takes 8,400ms to 12,000ms)


protected final void noRecursion(int possible) {
	int done = 0;
	for(int i = 0; i < possible; i++){
		String theWord = "";
		int val = i;
		for(int j = 0; j < passLength; j++){
			int ch= val % chars.length;
			theWord = chars[ch] + theWord;
			val = val / chars.length;
		}
		done++;
	}

My implementation of this which was literally ‘thought out line by line’ is just a few hundred milliseconds slower than the C# recursion.
Mind you mine involves no recursion. :stuck_out_tongue:

I found the C# implementations for brute forcing on some college PDF.
I’ll be doing some more benchmarking on the C# recursion implementation translated to java Vs. my current implementation of long for loops today :slight_smile:

I was stunned to find that PDF!
Than see that my 264 lines of code (Processes any password up to 11 slots) could be wrapped into 6 lines of code with something called ‘recursion’, and it handles unlimited password lengths LOL

I’ve never really messed with recursion before :stuck_out_tongue:

@VeaR:
Thanks for the feedback mate, this program was literally thought up overnight.
If you can find a version like mine where the recursion is literally hand programmed and it works, let me know XD
Regarding unnecessary object creation and transferring to byte[]-s instead of using a String Buffer thanks.
That’s the type of feedback I was hoping to get :slight_smile:

@pjt33:

I explained the algorithm mate:
#1: Start with: prabkqdtxczz
#2: Next up is: prabkqdtxdaa
#3: After that is: prabkqdtxdab
#4: After that is: prabkqdtxdac
etc.

Here’s a somewhat in depth explanation:
On #1 it will detect the z in the last slot.
It will treat that z as the last slot in the alphabet array.
If it is the last slot in the alphabet[] array (which it is) it will -1 in the password index (move back one slot), change the z to the beginning element of the alphabet[] array.
Next it will detect the z 1 slot before the slot we just changed from z to a.
Since the second to last slot we’re dealing with here is also a z… basically repeat the process we performed for the last z.
(Change the z to a, -1 in the password index, process it from a-z)

Hope it makes sense mate, it’s a algorithm.
Kind of hard to explain them :confused:

Program one, or run one of the C# implementations i translated to Java and you will see how it comes up with its computations.
(You’ll have to implement printing out the current computation as it’s not added in the examples above)

Here’s the first version of this I came up with overnight, this one involves no StringBuilders:
http://pastebin.com/Mm2LLUn4 (V0.1)

So you go through each password possibility one by one? In Python I programmed something like this that went through each possibility one by one and it seemed like it would have taken days to finish, even though it was just a lot of for loops. Is Java really that much faster than Python or is there some serious optimization going on?