Xorshift variation for getting per-coord / parameter numbers

JaredBGreat · April 30, 2016, 4:05pm

Someone may have come up with something like this before, wouldn’t really surprise me. But after trying to figure out a way to get random number per location (for generating terrain), and experimenting with creating per location seeds for Java.Random I realized this could probably be done much better with a home-brew variation on Xorshift. The idea is really simple, use the coords as the basis for (extra) shifts (along with hopefully good ones, just for safety):

The basis is like this:

/**
     * Should produce a random long from seed at coordinates x and y
     *
     * @param seed
     * @param x
     * @param y
     * @return
     */
    public static long longFromSeed(long seed, int x, int y) {
        long out = seed + (12338621L * (long)x) + (15485863L * (long)y);
        out ^= lshift(out, (x % 19) + 7);
        out ^= rshift(out, (y % 13) + 31);  
        out ^= lshift(out, 4);
        return out;
    }
   
   
    /**
     * Should produce a random long from seed at coordinates x, y, t
     *
     * @param seed
     * @param x
     * @param y
     * @param t a fake iteration
     * @return
     */
    public static long longFromSeed(long seed, int x, int y, int t) {
        long out = seed + (5443 * (long)t)
                        + (12338621L * (long)x)
                        + (15485863L * (long)y);
        out ^= lshift(out, (x % 19) + 7);
        out ^= rshift(out, (y % 13) + 31);  
        out ^= lshift(out, 4);    
        out ^= rshift(out, (t % 5) + 23);
        return out;
    }
   
   
    /**
     * should generate num long values using longFromSeed.
     *
     * @param seed
     * @param x
     * @param y
     * @param num
     * @return
     */
    public static long[] longsFromSeed(long seed, int x, int y, int num) {
        long[] out = new long[num];
        out[0] = longFromSeed(seed, x, y);
        for(int i = 1; i < num; i++) {
            out[i] = longFromSeed(out[i-1], x, y);
        }
        return out;
    }
   
   
    /*=============================*
     *  INTERNAL UNTILITY METHODS  *
     ==============================*/
   
   
    /**
     * Performs left bit shift (<<) with wrap-around.
     *
     * @param in
     * @param dist
     * @return
     */
    private static long lshift(long in, long dist) {
        long out = in << dist;
        out += (in >>> (64 - dist));    
        return out;
    }
   
   
    /**
     * Performs right bit shift (>>) with wrap-around.
     *
     * @param in
     * @param dist
     * @return
     */
    private static long rshift(long in, long dist) {
        long out = in >>> dist;
        out += (in << (64 - dist));    
        return out;
    }
}

The second method is for creating fake sequence, with using “t” as a kind of fake iteration of calls (obviously, does not get the same number as call t time, but not the point). The third is to get a bunch of number by using traditional repetition of getting the next number.

Beyond that the full class does include methods for deriving other types, as well as instance methods that use a pre-store seed. (For terrain gen, only one is needed per world, as however it might be devided up into sub-section, all you need are the coords.)

The full class, which I plan on using in a project I’m working on (with further development), I’ve put on pastebin to keep this short.
http://pastebin.com/9qXZhS45

Roquen · April 30, 2016, 5:43pm

You can get better mixing by mixing algebra groups…which means less operations. I’ve always sucked at attempting to come up with this kind of thing. Examples are the bit-finalizers of mumurhash and xxhash, etc. etc.

I recently read a blog post about this kind of thing…I’ll look through my history to see if I can find it.

JaredBGreat · April 30, 2016, 5:52pm

Thanks, I’ve been trying to come up with something like this for a while – put together several RNG that were basically garbage. When I heard of Xorshift the lightbulb came on – but even if I stick with this approach it needs to improvement, a bit of which I’ve done (perturbed the seed to avoid repeats with x and y loop around), but it still has a slight bias toward 0.

A test generation 0-9 gives:

0’s: 766966
1’s: 669401
2’s: 599692
3’s: 598700
4’s: 602171
5’s: 621632
6’s: 618932
7’s: 602576
8’s: 605209
9’s: 606177

Total = 27410618
Numebr = 6291456
Average = 4.3568006

(Also, not sure I may have over done some things {“t” twice? And does wrapping around help me in any way?} – so, it not optimized at all.)

EDIT: I’ve also modified it slightly in the short run to prevent early wrap around based on x and y (without creating special seeds elsewhere):

public long longFromTile(int x, int y, int t) {
        long out = seed + (5443 * (long)t) 
                        + (12338621L * (long)x) 
                        + (15485863L * (long)y);
        out ^= lshift(out, (x % 19) + 7);
        out ^= rshift(out, (y % 13) + 31);
        out ^= lshift(out, 4);  
        out ^= rshift(out, (t % 5) + 23);
        return out;
    }

I’ll look into other ways of optimizing it. Not to concerned about long periods at any one spot though, since it would probably used to create a small set of number for perlin noise, cellular automata, or similar. These improvements also fix the zero bias.

delt0r · May 1, 2016, 5:31am

There have been posts on this in the past. Including my own posts. I just spent a while going through em… it was too long ago. But I used msequences. Very fast and proper pseudo random proprieties.

Roquen · May 1, 2016, 8:35am

m-sequence + counter have the same problem - the only change algebraic structure once. Z for counter, F2 for m-sequence. It is a solution that works though…esp if you don’t need good statistical properties (which you don’t for the listed usages). This is a problem much harder than it seems.

JaredBGreat · May 1, 2016, 11:36am

I did a search to see if there was anything similar, but didn’t know what key words to look for, and found nothing with the terms I searched. It would be interesting to see what some of these posts actually are. I see again if I can find them.

I did test the newer version using these tests:

Though I don’t know enough about all the tests to fully interpret all them. I does seem to perform about the same as java.util.Random or between that and java.security.SecureRandom, depending on the test, and profile at about the same speed as java.Random (could probably be much faster if I didn’t use the wrap-around shifting functions). Seems less biased with low number than java.Random, too, now. I’ve yet to test it with an actual, practical use though; I’m sure it will be much better than my early naive attempts, which created obvious patterns.

delt0r · May 1, 2016, 8:57pm

Well one that isn’t totally bad and fairly fast is
((a+x)*b+y)*c+z)*d
a,b,c and d are large primes. I use the 4 largest primes for the number size i use (4 largest smaller than 2^64 for longs.)