Better HashMap keys

I use to store much in hashmap since it’s fast and easy to do. But my HashMap keys are really bad. Normally I have something like. String key = x + “.” + y + “.” + z; to make sure every object have it’s own key.I have been thinking of using << and .hashCode() instead. But I have no idea if that’s a good way to do it if I want unique keys.

Example


int locationKey = x << y << z;
int recoloredSpriteKey = img.hashCode() << color.hashCode();

You could create a Point3D class and give it a hashCode method.

Here is the source code for java.awt.geom.Point2D.hashCode, maybe that will help you.

public int hashCode() {
    long bits = java.lang.Double.doubleToLongBits(getX());
    bits ^= java.lang.Double.doubleToLongBits(getY()) * 31;
    return (((int) bits) ^ ((int) (bits >> 32)));
}

Call me crazy…but maybe think about using a hashing function? :stuck_out_tongue:

MurmurHash2 is a reasonable choice for speed vs. quality.

What do you mean “speed vs. quality” what I want is generate a unique number from x, y, z.

You can’t…you can only get close. Read up on hashing.

No you clearly don’t get it, and it’s so simple. I created this topic to get help with my keys in HashMap, since I knew I didn’t need to use Strings as keys, but I didn’t know how to create a good key with numbers, since x + y + z wouldn’t work.

Also, The are no Vector3, Vector3d, or Vector3D in java2D, so I couldn’t possibly know about it. Now when I do i need something that works in java2D

How about making a custom key class, something like a Vector3?

There’s an excellent example of this on StackOverflow:

Just use this as your key, doesn’t get much more elegant.

Trust me. Use MurmurHash2. Or if the number of entries is insured to be small, then use Bernsteins. Call it a day and move on.

DO NOT MAKE UP YOUR OWN. TOSSING FUNCTIONS OF PRIME NUMBERS TOGETHER WILL SUCK.

:persecutioncomplex:

Cute girl wants to do groceries with car.
Cute girl has no car.
Cute girl goes to car dealer to purchase said car.
Car dealer chats about cilinder volume.
Car dealer warns about CO levels in city traffic.
Car dealer strongly suggests to buy a hybrid.
Cute girl points at pink car.
Car dealer BEGINS TO SHOUT MORE WARNINGS.
Cute girl walks off confused.
Cute girl is wondering how to get those groceries home.
Car dealer didn’t make a sale - despite his expertise.

That… That was very clever.

I am curious why the solution I proposed “sucks”. Of course making the class immutable would be (much) better, i.e. see below:

public class Vector3Key {
    private final int x, y, z;

    public void Vertex(int x, int y, int z){
        this.x = x;
        this.y = y;
        this.z = z;
    }

    public boolean equals(Object o){
        if(this == o) return true;
        if(!(o instanceof Vertex)) return false;
        Vertex v = (Vertex)o;
        return x == v.x && y == v.y && z == v.z;
    }

    public int hashCode(){
        //  Use whatever prime numbers you like
        return x ^ y * 137 ^ z * 11317;
    }
}

I can also imagine that the hashCode generated would be sub-optimal with regards to some mathematical properties, e.g. distribution or the number of collisions (had to read up on this). Are these mathematical properties the reason to use a hash library such as Murmur?

Also, I kind of see how the function above might not solve the OP’s problem as I can imagine the positions of objects would change, which may result in undefined behaviour for Java Hash collections.

Actually I would call it good enough, if it doesn’t create a problem for you. You may use a StringBuilder to reduce created temporary Strings when doing String concatenation with +


StringBuilder keyBuilder=new StringBuilder();
keyBuilder.append(x).append('.').append(y).append('.').append(z);
String key=keyBuilder.toString();

If you are sure you are using the StringBuilder from the same thread only, then you can even cache and reuse it when calling setLength(0) before:


static final StringBuilder keyBuilder=new StringBuilder();
(...)
keyBuilder.setLength(0);
keyBuilder.append(x).append('.').append(y).append('.').append(z);
String key=keyBuilder.toString();

Using primitives as keys is a bad idea, because it will lead to a lot of autoboxing while using the hashmap (the java buildin Map implemtations need Objects as keys, so they will convert ints to Integers and back as needed, causing a lot of short lived objects and therefore garbage collection runs)

Other than that, the << operator shifts the left argument bitwise by the value of the right argument wrapping around, so what you listed above probably won’t work, since e.g 64 shifted by 40 would result in the same key like 128 shifted by 39 etc.

So if you want to use this approachm you would need to go for something like


int locationKey = x*MAXIMUM_Y + y*MAXIMUM_Z + z;
int recoloredSpriteKey = img.someId * (256^3) + color.r*(256^2) + color.g*256 + color.b;

As you can see, your img-ids are limited to 256 values now and if you want to get the alpha value into the key, you are screwed anyway (or need to switch to long).

Now here comes, why lukasz1985 don’t get your question. You didn’t explain the use-case for which you need this key and the hashmap. If you want them for managing your objects, then you will probably be better of by just using names to identify them, because using the coordinates would mean the keys will constantly change when you move your object around the screen. If you don’t need the keys for fast retrieval, then there is no point in storing the objects in a HashMap at all…

If you really need to and you want to avoid String keys, then a custom key object like Grunnt posted above would do the trick.

Maybe this statement is out of context, but overriding the built in Object.hashCode() method (along with the equals() method) is exactly what you need to do when creating custom keys…

Actually I would think it doesn’t matter at all, because the amount of objects will be so small in the end that using an array and a brute-force search would be good enough anyway :wink:

@Riven: The thing is: I don’t have any vested interest in anyone listening to me. If someone does great.

Anyone tempted to toss together their own hashing function I suggest you do the following. Do a web search for: survey, non-cryptograhic hash…or something similar…grab a research paper and count the number of functions they talk about. It’ll be a really really small number and ask yourself: With so many programmers and researcher over some many years of computing…why are there so few functions being talked about? There’s a story about Knuth (if you don’t know who that it…please go look it up now…thanks) attempting to create a random number generator and it being a massive failure. This stuff is hard and specialized.

There’s no margin is goofying around with it. Pick an accepted “reasonable” hashing function, use it and call it a day (at least for a 5 years or so) unless you have some specific reason to rethink that choice.

Actually I had no clue that anyone here wrote that code. I glanced at it for less than 2s and closed the window. Most homebrew hash function look somewhat like this.

But I think you’ve answered your own question. Oddly enough the “mathematical properties” of a hashing function are rather (read: all) important.

Bob Jenkin’s (wrote these hashes) has a good overview: http://burtleburtle.net/bob/hash/evahash.html (you have to click around a bit)

In my opinion it’s of no use to throw expertise at newbies. It both confuses them and actually hurts their learning capability, as they are suddenly confronted with details they can’t correctly interpret and will most likely misinterpret as a result, only adding to their confusing.

In this case, the OP doesn’t seem to understand hashing and hash maps in the first place, because if your key contains all the information that is stored in the value, you don’t need the value, and/or you don’t need to lookup anything to start with, because you already have the information you are about to lookup. It’s much more expensive (in CPU terms) to generate the key through string concatenation, than it is to create a new Vec3 object, which is even GC friendlier because it only creates one instance, instead of almost half a dozen, for string concat.

So, as always, maybe one shouldn’t try to strictly answer the question, but to unravel the problem, which might lead to a much better fitting solution. If the question is wrong, a highly detailed answer, showing your expertise to the world - hoping somebody else entirely will learn from it - won’t help anybody.

Having said all that - what is the problem the original poster (Yemto) is trying to solve? I’m sure it has very little, if anything, to do with hashing, let alone picking the best hashing algorithm.

@Riven - that indeed is a good point. I wholeheartedly endorse your post.

[quote=“Riven,post:14,topic:44363”]
Haha, Riven, I’m impressed by your wisdom ;D I’m even quite serious. You sound like Yoda.

It’s one of the things I find hardest to do: take good time to look at the problem, instead of starting to thing in solutions right away… This thread may be a good example… So what IS Yemto trying to achieve with his/her HashMap? I have no clue really… To make it easy to locate objects by location? Just for shits and giggles? Heh. I’m curious though, so maybe Yemto can enlighten us.

[quote=“Roquen,post:13,topic:44363”]
Ah, sorry, I didnt actually write the code, I just linked to it as a proposed solution.

This is very interesting perspective. It’s once again the question of giving a fish or teaching to fish, but I have never thought about adding to the confusion with too much, too complex information. When You think about it, it’s logical, You need some basic understanding before moving up to the advanced setup.

I bow to Your wisdom, mr Riven.

Anyway, back to the topic, indeed noone has yet asked what the poster is trying to do. Given that he has ‘x’, ‘y’ and ‘z’ even a simple

Object[][][]

might be a better solution than a hashmap.

You asked and Riven asked.
Arrays vs HashMap is irrelevant as the extend of the usage is not defined here. There are different tools for different tasks.

[quote=“lukasz1985”]
Depends on how you are locating the object. The hashmap mentioned in the OP can basically only be used to locate objects based on their X,Y and Z values. Locating an object in a threedimensional array is actually quite a bit faster in this case (at least I think so, didnt test it but I assume its basically a single instruction directly fetching the value/object) than locating it in a HashMap (which would involve some searching and invoking of hash functions at least). Anyway, this is all TheoryCraft…