map performance

Why is it bad in your view? It’s a choice and it has a meaning; we don’t use this license to bother people. I use it because it reflects my political convictions. Sorry for this short off-topic remark.

Thanks for the additional information delt0r. I would love to see what you have, even if it isn’t perfect yet.

gouessej, because I write software to pay the bills in addition to doing it as a hobby. Using a GPL’ed library would mean open sourcing my whole app, which is not something I want to do for my commercial software.

This is a great discussion, so many good techniques. Thanks Nate for doing such an awesome job of benchmarking all this stuff.

In the map I’m using for the A* algorithm I make the key and the value the same object, so I should probably make a map where there’s only one array of keys and no entry objects (unless I’m missing something obvious :stuck_out_tongue: :persecutioncomplex: ). I’ll give it a go and post the code once I get it working.

CommanderKeith, you might post your code and maybe we can come up with something that specifically fits it.

Reading about cuckoo hashing I came across Bloom filters, pretty neat stuff.
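The gist as I understand it, in a minimal sketch (hypothetical class; deriving the k probe positions from two base hashes is a standard trick, but don’t treat this as production code):

import java.util.BitSet;

/** Minimal Bloom filter sketch: k derived hashes set/test bits in a BitSet. */
public class SimpleBloomFilter {
	private final BitSet bits;
	private final int numBits, numHashes;

	public SimpleBloomFilter(int numBits, int numHashes) {
		this.bits = new BitSet(numBits);
		this.numBits = numBits;
		this.numHashes = numHashes;
	}

	// Derive the i-th probe position from two base hashes (double hashing).
	private int index(Object o, int i) {
		int h1 = o.hashCode();
		int h2 = (h1 >>> 16) | 1; // force odd so the step is never zero
		return Math.abs((h1 + i * h2) % numBits);
	}

	public void add(Object o) {
		for (int i = 0; i < numHashes; i++)
			bits.set(index(o, i));
	}

	/** false means definitely absent; true means only "probably present". */
	public boolean mightContain(Object o) {
		for (int i = 0; i < numHashes; i++)
			if (!bits.get(index(o, i)))
				return false;
		return true;
	}
}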

So with all the warnings I previously gave, here is the src:
http://www.cibiv.univie.ac.at/~greg/CuckooHashMapSrc.jar
I place this in the public domain, with no warranty, etc.

These classes do not conform to the general contract for Map. In particular, values(), keySet() and entrySet() return copies of the hash map at that point in time.
Removing elements from these returned Collections will have no effect on the map.

We do not “rehash” values returned from Object.hashCode() as the Collections classes do. We require that stored objects have taken proper measures to ensure that they provide sufficient randomness.
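For comparison, this is roughly the supplemental mixing JDK 6’s java.util.HashMap applies before masking (quoted from memory, so check your JDK’s source); this class deliberately skips that step:

// Spreads high bits downward so that power-of-2 masking doesn't collide
// keys that differ only in their upper bits.
static int supplementalHash(int h) {
	h ^= (h >>> 20) ^ (h >>> 12);
	return h ^ (h >>> 7) ^ (h >>> 4);
}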

Please treat this code as a work in progress. It is fast for remove and contains, and similar to other implementations for put. But as I said, I have never even run a profiler on it, and probably won’t, at least not on its own; i.e. I benchmark it in the app as it’s used in the app.

Cuckoo hashing, however, has worst-case O(1) time for retrieval, contains and delete, and expected O(1) time for add. This is in contrast to chaining and linear probing, which provide O(n) worst-case performance.

Note that using a load factor larger than 0.75 will generally result in a resize before the limit is reached, so in practice the load factor will never be much higher than 0.75. In fact I think you can prove that with 3 hash functions it would be between 0.75 and 0.86 or so.

Sorry, there was a minor defect in one of the hashes. That’s now fixed. At any rate it did not affect correctness.

Sweet! Thanks delt0r. I’ll be playing with it as soon as I get a chance!

Thanks! My project isn’t really bottlenecked by map gets or puts, but it’s maxing out the garbage collector with too many HashMap.Entry objects, so I tried the caching-style maps you and JL235 made and they were perfect. 8)

But delt0r’s cuckoo hashing idea was too cool to refuse since:

  • it doesn’t use entry objects so there’s no garbage at all or any need for object pooling,
  • the key is the object so there’s no double-storage of references.

The only disadvantage for me, I guess, is that the hash code is never stored, so contains(obj) [or get(obj)] might be a tad slower: the hash of the object being looked up has to be compared against at least one other object’s hash code, so at least 2 hash codes have to be computed.
This is the main modification which speeds up the map in my case:

//int hash =System.identityHashCode(e);
// is replaced by:
int hash =e.hashCode();

I was wondering delt0r, would your implementation be faster if you didn’t calculate hash2 and hash3 until you knew you actually needed them?
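Something like this is what I have in mind for contains(), just a sketch reusing your hash functions (untested): hash2 and hash3 are only computed when the earlier probes miss.

	public boolean contains(Object e) {
		if (e == null)
			return false;
		int hash = e.hashCode();
		int h1 = hash & mask;
		if (table[h1] == e)
			return true; // hit on the first probe: hash2/hash3 never computed
		int h2 = (1 + ((hash * PRIME) + (hash >>> 16))) & mask;
		if (table[h2] == e)
			return true;
		int h3 = (2 + (hash ^ (hash >>> 16)) * PRIME) & mask;
		return table[h3] == e;
	}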

Cheers,
Keith

PS: here’s delt0r’s code as I’m using it, with some minor modifications


package astarpathfinder.core;

import java.lang.reflect.Array;
import java.util.*;

//import at.mabs.util.IdentityHashSetLinearProbe;

/**
 * Checking to see how fast 3-way cuckoo hashing is compared to linear probing.
 *
 * The deletes should be *much* faster.
 *
 * Performance is as good as or much better than linear probing, except for pathological cases where it's
 * about the same (5% faster). It was never slower. Overall the performance is 2-5x faster than java.util.HashSet,
 * and sometimes much more (for crazy cases that probably never happen).
 *
 * @author greg ewing
 *
 * I put this into the public domain
 *
 * @param <T>
 */
public class CIdentityHashSet<T> implements Set<T> {
	private T[] table;
	private int size;
	private double loadFactor;
	private int mask =0;
	private int threshold;
	private static int PRIME =2053368389;// random 31 bit prime cus primes are cool

	public CIdentityHashSet(){
		this(1024, 0.75);
	}

	public CIdentityHashSet(int initSize, double loadFactor) {
		this.loadFactor =loadFactor;
		int pow2Size =Integer.highestOneBit(initSize);
		if (pow2Size < initSize)
			pow2Size <<=1;
		table =(T[]) new Object[pow2Size];
		mask =pow2Size - 1;
		threshold =(int) (pow2Size * loadFactor);
	}

	@Override
	public boolean add(T e) {
		if (e == null)
			throw new RuntimeException("Does not permit nulls");

		//int hash =System.identityHashCode(e);
		int hash =e.hashCode();
		int hash2 =1 + ((hash * PRIME) + (hash >>> 16));
		int hash3 =2 + (hash ^ (hash >>> 16)) * PRIME;
		hash =hash & mask;
		hash2 =hash2 & mask;
		hash3 =hash3 & mask;
		//inlining for performance. makes add about 20% faster
		if (table[hash] == e || table[hash2] == e || table[hash3] == e)
			return false;
		if(table[hash]==null){
			table[hash]=e;
		}else if(table[hash2]==null){
			table[hash2]=e;
		}else if(table[hash3]==null){
			table[hash3]=e;
		}else{
			T pushed=table[hash];
			table[hash]=e;
			//System.out.println("Pushing "+e);
			pushInsert(pushed);
		}
		size++;
		// check size... to avoid cascading rehashes.
		if (size > threshold)
			rehash();
		return true;
	}

	private boolean pushInsert(T e) {
		//int hash =System.identityHashCode(e);
		int hash =e.hashCode();
		int hash2 =1 + ((hash * PRIME) + (hash >>> 16));
		int hash3 =2 + (hash ^ (hash >>> 16)) * PRIME;
		hash =hash & mask;
		hash2 =hash2 & mask;
		hash3 =hash3 & mask;
		while (true) {
			for (int i =0; i < table.length; i++) {
				if (table[hash] == null) {
					table[hash] =e;
					// System.out.println("Iter1:"+i);
					return true;
				}
				if (table[hash2] == null) {
					table[hash2] =e;
					// System.out.println("Iter2:"+i);
					return true;
				}
				if (table[hash3] == null) {
					table[hash3] =e;
					// System.out.println("Iter3:"+i+"\t"+hash+"\t"+hash2+"\t"+hash3);
					return true;
				}
				// didn't add --evict hash element
				T pushed =table[hash];
				table[hash] =e;
				e =pushed;
				int oldHash =hash;
				// now get all the hashes for the new element
				//int hash =System.identityHashCode(e);
				hash =e.hashCode();
				hash2 =1 + ((hash * PRIME) + (hash >>> 16));
				hash3 =2 + (hash ^ (hash >>> 16)) * PRIME;
				hash =hash & mask;
				hash2 =hash2 & mask;
				hash3 =hash3 & mask;
				// check the pathological case of old=new.
				if (hash == oldHash) {
					hash =hash3;
					//hash3 =oldHash;// swap
				}

			}
			//System.out.println("PushRehash");
			rehash();
			// the last element we still have now has invalid hash codes
			// this should be very rare and so expensive that this hardly matters
			//int hash =System.identityHashCode(e);
			hash =e.hashCode();
			hash2 =1 + ((hash * PRIME) + (hash >>> 16));
			hash3 =2 + (hash ^ (hash >>> 16)) * PRIME;
			hash =hash & mask;
			hash2 =hash2 & mask;
			hash3 =hash3 & mask;
		}
		// return false;

	}

	// must re-add all old elements once done...
	private void rehash() {
		T[] oldTable =table;
		int oldSize =size;

		table =(T[]) new Object[table.length * 2];
		size =0;
		mask =table.length - 1;// power of 2 assumed...
		threshold =(int) (table.length * loadFactor);

		for (int i =0; i < oldTable.length; i++) {
			if (oldTable[i] != null)
				add(oldTable[i]);
		}
		assert (size == oldSize);
		//System.out.println("Rehash:" + table.length);
	}

	@Override
	public boolean addAll(Collection<? extends T> c) {
		boolean flag =false;
		for (T e : c) {
			flag |=add(e);
		}
		return flag;
	}

	@Override
	public void clear() {
		size =0;
		Arrays.fill(table, null);
	}

	@Override
	public boolean contains(Object e) {
		if (e == null)
			return false;
		//int hash =System.identityHashCode(e);
		int hash =e.hashCode();
		int hash2 =1 + ((hash * PRIME) + (hash >>> 16));
		int hash3 =2 + (hash ^ (hash >>> 16)) * PRIME;
		hash =hash & mask;
		hash2 =hash2 & mask;
		hash3 =hash3 & mask;

		if (table[hash] == e || table[hash2] == e || table[hash3] == e)
			return true;
		return false;
	}
	public T get(T e){
		if (contains(e)){
			return e;
		}
		return null;
	}
	public T put(T e, Object obj){
		add(e);
		return null;
	}

	@Override
	public boolean containsAll(Collection<?> c) {
		for (Object e : c) {
			if (!contains(e)){
				return false;
			}
		}

		return true;
	}

	@Override
	public boolean isEmpty() {
		return size == 0;
	}

	@Override
	public Iterator<T> iterator() {

		return new Iterator<T>() {
			private int count =0;
			private int size =CIdentityHashSet.this.size;
			private int currentIndex =-1;
			private boolean removed =false;

			@Override
			public boolean hasNext() {

				return count < size;
			}

			@Override
			public T next() {
				if(count==size)
					throw new NoSuchElementException();
				count++;
				currentIndex++;
				while (table[currentIndex] == null) {
					currentIndex++;
				}
				removed =false;
				return table[currentIndex];
			}

			@Override
			public void remove() {
				if (removed)
					throw new IllegalStateException("Element already removed");
				// not the most efficient...
				CIdentityHashSet.this.remove(table[currentIndex]);
				size--;
				currentIndex--;
				count--;
				removed =true;
			}

		};
	}

	@Override
	public boolean remove(Object o) {
		if (o == null)
			return false;
		//int hash =System.identityHashCode(e);
		int hash =o.hashCode();
		int hash2 =1 + ((hash * PRIME) + (hash >>> 16));
		int hash3 =2 + (hash ^ (hash >>> 16)) * PRIME;
		hash =hash & mask;
		hash2 =hash2 & mask;
		hash3 =hash3 & mask;

		if (table[hash] == o) {
			table[hash] =null;
			size--;
			return true;
		}
		if (table[hash2] == o) {
			table[hash2] =null;
			size--;
			return true;
		}
		if (table[hash3] == o) {
			table[hash3] =null;
			size--;
			return true;
		}
		return false;
	}

	@Override
	public boolean removeAll(Collection<?> c) {
		boolean flag =false;
		for (Object o : c) {
			flag |=remove(o);
		}
		return flag;
	}

	@Override
	public boolean retainAll(Collection<?> c) {
		boolean flag =false;
		for(int i=0;i<table.length;i++){
			T e=table[i];
			if(e!=null && !c.contains(e)){
				table[i]=null;
				size--;
				flag=true;
			}
		}
		return flag;
	}

	@Override
	public int size() {

		return size;
	}

	@Override
	public <E> E[] toArray(E[] a) {
		if(a.length<size){
			a=(E[])Array.newInstance(a.getClass().getComponentType(),size);
		}
		int index=0;
		for(int i=0;i<table.length;i++){
			if(table[i]!=null)
				a[index++]=(E)table[i];
		}
		return a;
	}

	@Override
	public Object[] toArray() {
		Object[] array=new Object[size];
		int index=0;
		for(int i=0;i<table.length;i++){
			if(table[i]!=null)
				array[index++]=table[i];
		}
		return array;
	}

	public String toString() {
		StringBuilder builder =new StringBuilder();
		builder.append("IdentSet:" + size + "(");
		for (int i =0; i < table.length; i++) {
			if (table[i] != null) {
				builder.append(table[i].toString() + ",");
			}
		}
		builder.append(")");
		return builder.toString();
	}

	public static void main(String[] arg) {

	}

}


There are a few no-brainers to speed up my implementation. Lazy calculation of the hashes is one. Better hash calculations are another; getting rid of the multiply, for starters. There are other things too.

However you will probably be disappointed with the performance increase, as it would probably be close to undetectable. What I do for some of my sets, however, is to use a special object type with 3 final int fields. Since I only need identity, these are filled with random ints at creation time; they then serve as my 3 hash values and I don’t need to calculate anything.
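Roughly like this (a sketch of the idea, not my actual class):

import java.util.Random;

/** Identity-only element with 3 precomputed random "hash" values. */
public abstract class PrehashedNode {
	private static final Random RANDOM = new Random();
	// Filled once at creation; the set reads these directly (e.g. hash1 & mask)
	// instead of mixing e.hashCode() on every add/contains/remove.
	public final int hash1 = RANDOM.nextInt();
	public final int hash2 = RANDOM.nextInt();
	public final int hash3 = RANDOM.nextInt();
}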

It does increase performance. But only a little. The best part of cuckoo hashing is worst case O(1) performance.

But a good implementation is nothing compared to just being smarter with how you use it in the first place.

That’s pretty clever. Thanks for the code, it’s smoothed out my A* path-finder so there are no GC pauses and it’s super fast, which is fantastic 8)

Ok guys, so if I want the best performance on Android, what should I use? The CuckoFastMap that was posted above here:
http://www.java-gaming.org/index.php/topic,21395.msg174900.html#msg174900
?

The only way to really know is to test your specific code. How you use the map will affect performance and it will vary on different hardware. If you aren’t having a performance problem, HashMap is fine (don’t optimize prematurely). If you are having a performance problem, profiling or benchmarking is a must to really understand what is going on.

That said, if you have int keys, chances are FastIntMap will work well for you. If you have GC problems, CachingFastMap is a good bet.

I meant to do more with the cuckoo map but I got distracted and wandered off. I’ll come back to it eventually!

Cuckoo hashing has worst-case O(1) performance, so in theory it is as good as or better than anything else. BUT the implementation in this thread is a work in progress, so if you used it you would need to expect bugs. It should be noted that I am using a version in production code. But it’s something to think about.

Finally updated with delt0r’s cuckoo map.

Edit: updated charts again, this time with the server VM, 5000 for warmup, and 5000 for runs.

http://chart.apis.google.com/chart?chtt=Get, Desktop&chs=700x345&chd=t:345,691,691,691,691,691,691,691,1037,1037,1037,1037,1383,1383,1729,3112,3112,3804&chds=0,3804&chxl=0:|Hashtable, Java|THashMap, Trove|FastHashMap, Commons Collections|OpenIntObjectHashMap, Cern Cole|HashMapV2, Doug Lea|FastHashMap, alex14n|ApacheIntHashMap, Commons Lang|TIntObjectHashMap, Trove|IdentityHashMap, Java|HashedMap, Commons Collections|CHashMap (cuckoo), delt0r|HashMap, Java|CachingHashMap, JL235|IdentityMap, Commons Collections|FastMap, Nate|FastIdentityMap, Nate|CachingFastMap, Nate|FastIntMap, Nate&cht=bhg&chbh=10&chxt=y&chco=660000|660033|660066|660099|6600CC|6600FF|663300|663333|663366|663399|6633CC|6633FF|666600|666633|666666

http://chart.apis.google.com/chart?chtt=Put, Desktop&chs=700x345&chd=t:3113,3804,3804,3805,4150,4496,4842,4842,5188,5534,5534,5879,6571,6571,6918,7263,8301,11068&chds=0,11068&chxl=0:|OpenIntObjectHashMap, Cern Cole|FastHashMap, Commons Collections|Hashtable, Java|HashMapV2, Doug Lea|HashedMap, Commons Collections|FastHashMap, alex14n|HashMap, Java|THashMap, Trove|FastMap, Nate|CHashMap (cuckoo), delt0r|CachingFastMap, Nate|IdentityHashMap, Java|IdentityMap, Commons Collections|FastIdentityMap, Nate|ApacheIntHashMap, Commons Lang|TIntObjectHashMap, Trove|FastIntMap, Nate|CachingHashMap, JL235&cht=bhg&chbh=10&chxt=y&chco=660000|660033|660066|660099|6600CC|6600FF|663300|663333|663366|663399|6633CC|6633FF|666600|666633|666666

http://chart.apis.google.com/chart?chtt=Total, Desktop&chs=700x345&chd=t:3804,4149,4841,4841,4842,5187,5533,5879,5879,6225,6570,7608,7954,8301,8646,11067,11413,12797&chds=0,12797&chxl=0:|OpenIntObjectHashMap, Cern Cole|FastHashMap, Commons Collections|Hashtable, Java|THashMap, Trove|HashMapV2, Doug Lea|FastHashMap, alex14n|HashedMap, Commons Collections|HashMap, Java|FastMap, Nate|CHashMap (cuckoo), delt0r|IdentityHashMap, Java|CachingFastMap, Nate|IdentityMap, Commons Collections|ApacheIntHashMap, Commons Lang|FastIdentityMap, Nate|TIntObjectHashMap, Trove|FastIntMap, Nate|CachingHashMap, JL235&cht=bhg&chbh=10&chxt=y&chco=660000|660033|660066|660099|6600CC|6600FF|663300|663333|663366|663399|6633CC|6633FF|666600|666633|666666

Hey Nate,

I converted FastIntMap to FastLongMap for LWJGL’s OpenCL package, then I wrote a small benchmark to compare it with Java’s HashMap. For put(), I’m seeing a small performance gain (up to 10%) on JDK6-server and JDK7 VMs by extracting the rehashing code into its own method. Could be something specific to my benchmark/machine, but you may want to try it out.

I just tried your benchmark. With count = 10000 (the default 100 runs everything in interpreted mode) and a slightly changed test loop:

for (Test runnable : tests) {
    warmup = true;
    for (int i = 0; i < 5; i++)
        runnable.run();
    warmup = false;
    for (int i = 0; i < 5; i++)
        runnable.run();
    System.out.println();
}

I get these results on JDK7 b111:

http://chart.apis.google.com/chart?chtt=Get, Desktop&chs=700x345&chd=t:70825,93864,95144,105384,110504,117330,127143,130983,133970,149756,173649,184315,196262,235087,403190,557640,655771,974909&chds=0,974909&chxl=0:|Hashtable, Java|FastHashMap, Commons Collections|THashMap, Trove|OpenIntObjectHashMap, Cern Cole|TIntObjectHashMap, Trove|ApacheIntHashMap, Commons Lang|FastHashMap, alex14n|HashMapV2, Doug Lea|HashMap, Java|CachingHashMap, JL235|HashedMap, Commons Collections|IdentityMap, Commons Collections|CachingFastMap, Nate|FastMap, Nate|IdentityHashMap, Java|FastIdentityMap, Nate|CHashMap (cuckoo), delt0r|FastIntMap, Nate&cht=bhg&chbh=10&chxt=y&chco=660000|660033|660066|660099|6600CC|6600FF|663300|663333|663366|663399|6633CC|6633FF|666600|666633|666666

http://chart.apis.google.com/chart?chtt=Put, Desktop&chs=700x345&chd=t:246607,308899,318712,344312,374178,378444,396364,436469,459935,480841,517534,619504,672837,677530,805953,1111866,1116133,1375113&chds=0,1375113&chxl=0:|OpenIntObjectHashMap, Cern Cole|FastHashMap, Commons Collections|Hashtable, Java|CHashMap (cuckoo), delt0r|FastHashMap, alex14n|THashMap, Trove|HashMapV2, Doug Lea|HashMap, Java|CachingFastMap, Nate|CachingHashMap, JL235|ApacheIntHashMap, Commons Lang|IdentityMap, Commons Collections|TIntObjectHashMap, Trove|HashedMap, Commons Collections|FastIdentityMap, Nate|IdentityHashMap, Java|FastMap, Nate|FastIntMap, Nate&cht=bhg&chbh=10&chxt=y&chco=660000|660033|660066|660099|6600CC|6600FF|663300|663333|663366|663399|6633CC|6633FF|666600|666633|666666

http://chart.apis.google.com/chart?chtt=Total, Desktop&chs=700x345&chd=t:317432,419403,424096,439456,505161,523507,593905,598171,613531,632731,667290,793153,861845,899817,1230477,1771904,1778303,2086775&chds=0,2086775&chxl=0:|Hashtable, Java|OpenIntObjectHashMap, Cern Cole|FastHashMap, Commons Collections|THashMap, Trove|CHashMap (cuckoo), delt0r|FastHashMap, alex14n|HashMapV2, Doug Lea|HashMap, Java|ApacheIntHashMap, Commons Lang|TIntObjectHashMap, Trove|CachingFastMap, Nate|CachingHashMap, JL235|IdentityMap, Commons Collections|HashedMap, Commons Collections|FastIdentityMap, Nate|IdentityHashMap, Java|FastMap, Nate|FastIntMap, Nate&cht=bhg&chbh=10&chxt=y&chco=660000|660033|660066|660099|6600CC|6600FF|663300|663333|663366|663399|6633CC|6633FF|666600|666633|666666

Thanks Spasi. Why does moving rehash to its own method improve things? Does the smaller method allow it to optimize better? Is there some general rule for this?

I made the rehash changes to FastIntMap, FastMap, and FastIdentityMap, set count=10000, and changed the loop as you indicated. I also tweaked CHashMap slightly to avoid some array lookups in “put” and avoid some hashes in “get” (the fast.7z source link above has been updated). My results are similar to yours using the 1.6.0_16 server VM on my Intel i920 (OC’d to 3GHz):

http://chart.apis.google.com/chart?chtt=Get, Desktop&chs=700x345&chd=t:81974,86124,88891,90275,104801,117253,118636,122096,124863,133856,144578,145961,148382,152187,166022,192309,246266,274282&chds=0,274282&chxl=0:|Hashtable, Java|FastHashMap, Commons Collections|OpenIntObjectHashMap, Cern Cole|THashMap, Trove|TIntObjectHashMap, Trove|FastHashMap, alex14n|CachingFastMap, Nate|HashMapV2, Doug Lea|ApacheIntHashMap, Commons Lang|HashMap, Java|CachingHashMap, JL235|FastMap, Nate|HashedMap, Commons Collections|IdentityMap, Commons Collections|IdentityHashMap, Java|FastIntMap, Nate|FastIdentityMap, Nate|CHashMap (cuckoo), delt0r&cht=bhg&chbh=10&chxt=y&chco=660000|660033|660066|660099|6600CC|6600FF|663300|663333|663366|663399|6633CC|6633FF|666600|666633|666666

http://chart.apis.google.com/chart?chtt=Put, Desktop&chs=700x345&chd=t:162910,195076,201994,232085,254222,257681,270477,277395,288118,360061,371128,399490,451372,458982,469012,605635,659938,762664&chds=0,762664&chxl=0:|OpenIntObjectHashMap, Cern Cole|FastHashMap, Commons Collections|FastHashMap, alex14n|Hashtable, Java|CHashMap (cuckoo), delt0r|THashMap, Trove|HashMap, Java|CachingFastMap, Nate|IdentityHashMap, Java|IdentityMap, Commons Collections|CachingHashMap, JL235|HashMapV2, Doug Lea|TIntObjectHashMap, Trove|HashedMap, Commons Collections|ApacheIntHashMap, Commons Lang|FastIdentityMap, Nate|FastMap, Nate|FastIntMap, Nate&cht=bhg&chbh=10&chxt=y&chco=660000|660033|660066|660099|6600CC|6600FF|663300|663333|663366|663399|6633CC|6633FF|666600|666633|666666

http://chart.apis.google.com/chart?chtt=Total, Desktop&chs=700x345&chd=t:251801,288118,313712,365941,371475,392919,399491,409868,415055,450336,517089,524353,540956,617394,743294,754017,906204,954973&chds=0,954973&chxl=0:|OpenIntObjectHashMap, Cern Cole|FastHashMap, Commons Collections|FastHashMap, alex14n|Hashtable, Java|THashMap, Trove|CHashMap (cuckoo), delt0r|HashMap, Java|CachingFastMap, Nate|IdentityHashMap, Java|HashMapV2, Doug Lea|TIntObjectHashMap, Trove|CachingHashMap, JL235|IdentityMap, Commons Collections|HashedMap, Commons Collections|ApacheIntHashMap, Commons Lang|FastMap, Nate|FastIdentityMap, Nate|FastIntMap, Nate&cht=bhg&chbh=10&chxt=y&chco=660000|660033|660066|660099|6600CC|6600FF|663300|663333|663366|663399|6633CC|6633FF|666600|666633|666666

Nice to see the cuckoo CHashMap “get” times are fast. CHashMap uses object keys while FastIntMap uses int keys, which probably gives FastIntMap a small boost.

Results not on the server VM were very erratic.

It should be said again that there are so many variables involved that these benchmark results likely don’t mimic how the map is used in a real app. Nothing beats benchmarking your actual app. The benchmarks also don’t show remove, contains, iteration, garbage generation, etc.

That said, FastIntMap certainly seems consistently fast. The FastMap and FastIdentityMap are the same as FastIntMap just with object keys, so are similarly fast.

[quote=“Nate,post:37,topic:34408”]Why does moving rehash to its own method improve things? Does the smaller method allow it to optimize better? Is there some general rule for this?[/quote]
I’m not an expert on such low-level stuff, but I used -XX:+PrintAssembly and the shorter method uses registers differently. Larger code size might affect CPU caches negatively too. A general rule could be: if you have a method with half its code not doing anything for 99% of the calls, you might as well extract that half into a different method to help the VM optimize the other half better.

[quote=“Nate,post:37,topic:34408”]Results not on the server VM were very erratic.[/quote]
I love JDK7’s tiered compilation :wink:

IIRC, small and simple methods get inlined by the JIT compiler automatically, thus removing the method call altogether.

[quote]Method inlining increases performance by reducing the number of method calls your program makes. The Java VM inlining code inlines methods that return constants or only access internal fields.

To take advantage of method inlining you can do one of two things. You can either make a method look attractive to the VM to inline or manually inline a method if it doesn’t break your object model. Manual inlining in this context means simply moving the code from a method into the method that is calling it.
[/quote]
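In map terms the pattern looks roughly like this (a sketch of the idea, not Nate’s or Spasi’s actual code): keep the per-call path small and push the rare resize into its own method.

/** Sketch of the extract-the-cold-path pattern with a toy linear-probe table. */
public class HotColdExample {
	private Object[] table = new Object[16];
	private int size, mask = 15, threshold = 12;

	/** Hot path: small enough for the JIT to inline and optimize well. */
	public void put(Object value) {
		int i = value.hashCode() & mask;
		while (table[i] != null)
			i = (i + 1) & mask; // linear probe, purely for illustration
		table[i] = value;
		if (++size > threshold)
			resize(); // cold path extracted: runs on a tiny fraction of calls
	}

	private void resize() {
		Object[] old = table;
		table = new Object[old.length * 2];
		mask = table.length - 1;
		threshold = (int) (table.length * 0.75);
		size = 0;
		for (Object o : old)
			if (o != null)
				put(o);
	}
}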

I just noticed that Nate’s implementation of cuckoo hashing on the other page is wrong.

For each object you have 2 or more hash functions (I have 3). Every object sits at one of the indices given by hash & mask (assuming a power-of-2 table size). This means with 3 hash functions you only check 3 positions.

What is wrong (from my reading) with Nate’s code is that if all the slots are full (a full collision), he rehashes the table to 2x the size. What you should do instead is take one of the objects that is in the way and check whether one of its other slots is empty; if so, move it there. You do this recursively, which avoids excessive rehashing. You catch the case where you end up with the object you started with by rehashing, but it can be proved this is very rare.
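In sketch form (hypothetical names, reusing the same 3 hash mixes as my set above; the bound on the eviction chain stands in for the rarity proof, and my real code also special-cases the evicted element hashing back to the same slot):

public class CuckooInsertSketch {
	private static final int PRIME = 2053368389;
	private Object[] table = new Object[1024];
	private int mask = table.length - 1;

	private int h1(Object e) { return e.hashCode() & mask; }
	private int h2(Object e) { int h = e.hashCode(); return (1 + (h * PRIME) + (h >>> 16)) & mask; }
	private int h3(Object e) { int h = e.hashCode(); return (2 + (h ^ (h >>> 16)) * PRIME) & mask; }

	public void insert(Object e) {
		for (int attempts = 0; attempts < table.length; attempts++) {
			int s1 = h1(e), s2 = h2(e), s3 = h3(e);
			if (table[s1] == e || table[s2] == e || table[s3] == e)
				return; // already present
			if (table[s1] == null) { table[s1] = e; return; }
			if (table[s2] == null) { table[s2] = e; return; }
			if (table[s3] == null) { table[s3] = e; return; }
			// Full collision: evict the first slot's occupant and re-place
			// it, instead of immediately rehashing to 2x the size.
			Object evicted = table[s1];
			table[s1] = e;
			e = evicted;
		}
		rehash(); // only after a long eviction cycle, which is provably rare
		insert(e);
	}

	private void rehash() {
		Object[] old = table;
		table = new Object[old.length * 2];
		mask = table.length - 1;
		for (Object o : old)
			if (o != null)
				insert(o);
	}
}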

It should be noted that with 2 hash functions the load factor should be about 0.5; with 3, about 0.75.

In fact the performance of puts is greatly improved by starting with a very oversized hash map, to avoid “cascading puts” and rehashing. That’s something I often do (say about 2x bigger than I expect it to need).
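A concrete usage example against the constructor in delt0r’s code above (Node is just a placeholder for whatever you store):

// Expecting roughly 10000 elements: allocate ~2x capacity up front so puts
// rarely trigger eviction chains or a rehash.
CIdentityHashSet<Node> closed = new CIdentityHashSet<Node>(20000, 0.75);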