VERY ROUGH WORK IN PROGRESS…NO REAL USEFUL INFORMATION AS OF YET.
A random number generator is like sex;
When it’s good, it’s wonderful,
And when it’s bad, it’s still pretty good. George Marsaglia
Games? Crank 'em out as fast as you can; Wham! Bam! Thank you ma’am! Roquen
[h3]Overview[/h3]
The purpose of this page is to give an rough overview of how pseudo random number generators (PRNG) work and a rough idea of how to choose ones which are adequate for a game runtime.
[h3]Rules of thumb[/h3]
- Never use the same instance of a generator in more than one thread.
- Never create static utility methods which call a generator unless single threaded (same as previous).
- If using a JDK supplied generator, use ThreadLocalRandom.
XXX Use different generators if you need a more expensive one for a sub-set of tasks…oh, and don’t really worry about any of this stuff if you just don’t generate many random numbers for a given task.
Everything below this point can (and should) be ignored unless you are generating a large numbers of random numbers per frame OR you tempted to use OR are using a “good” random number generator especially one with a period greater than 264. If a your most expensive AI entity needs up to half a million random numbers per frame, then a barely reasonable PRNG with a period of 232 should be more than sufficient.
Again this page is only dealing with PRNGs for game runtimes where “real” money is not directly involved…as in for gambling.
[h3]Introduction[/h3]
The vast majority of PRNGs all work in the same way. They take some state-data in the form of a sequence of bits and perform computations which mix the bits. Notice that hashing and random number generation are very similar. A given generator will have a period which is the number of unique values the state will assume before wrapping around and repeating itself. Yeah, that was clear. OK, let’s pretend like we’re using a 2-bit integer for our state data, so our 2-bit integer can have values of: {0,1,2,3}
. For our random number generator we want to produce some permutation (MathWorld, Wikipedia) of that set which appears random. The number of permutations of a set with n elements is Factorial[n]
, so we have 24 choices for our 2-bit integer. Limiting the set to those that start with ‘0’ (all the remaining are rotations), we have: {0,1,2,3},{0,1,3,2},{0,2,1,3},{0,2,3,1},{0,3,1,2},{0,3,2,1}
.
Now obviously none of the above permutation choices look very random but this is exactly how random number generators work. The problem here is that there just aren’t enough unique values that any of the permutations appear random. Luckily factorials explode and if we were talking about only 8-bits then there would be approximately 8.6x10506 choices and 16-bits is ~5.2x10287193. Other than some very narrow special cases (like in GLSL, when attempting to avoid using integers) there’s no reason to use generators with less than 32-bits.
An implication of this is that the period can be at most 2n, where ‘n’ is the number of bits. On this page we will only consider “full period generators” which will have a period of either 2n or 2n-1. The minus one comes into play for generators for which zero is an invalid state.
This does not imply that PRNGs with longer periods are superior in quality than those that are shorter.
[h3]Period[/h3]
As mentioned above the period of a PRNG is the length of the sequence that it generates. Once the period is exhausted, the sequence wraps-around and starts to repeat itself.
For scientific usage long periods are important. The reason for this is that statistically they will begin show defects if in “single computation” more than sqrt(T)
random numbers are used, where T
is the period. Note that the sqrt(T) = T<sup>1/2</sup>
. Even more important is because one can break a problem across multiple CPUs/servers/whatever to attack a single problem and the longer the period, the lower the probability that the individual computation units will be using overlapping parts of the sequence.
Note the “single computation” part. There’s the mistaken notion in games that one needs a period long enough that it never repeats during gameplay. Not at all, it simply should be long enough that every single computation has more than enough values to complete.
So what about the period? Do we need to think in terms of T or sqrt(T)? The question is more or less moot. As noted above there’s no reason to use a generator with a period shorter than ~232, so you should feel safe if any single problem needs no more than 216 or 65536 random numbers, which seems highly unlikely. At the opposite end of the spectrum (SEE: url=http://www.java-gaming.org/topics/orders-of-magnitude/27917/view.html]orders of magnitude[/url]) it would take the Cray Titan more than 17 minutes to generate 264 random numbers and an infinite amount of time to generate 2128. Sticking the sqrt(T)
rule of thumb this means you never need a generator with a period greater than 2128.
[h3]Dimensions[/h3]
Humm…any ideas here kids, other than don’t worry about dimensions. Just don’t use a LCG for more than 2.
[h3]Quality[/h3]
PRNGs do not attempt to create sequences which appear random. Instead they are attempting to simulate statistical randomness.
For our purposes the main usefulness of having a generator which passes the majority to all of XXXX
XX note uniform distribution
XX Write something somehow convincing about how quality a la. diehard and various crushes shouldn’t matter.
[h3]Old Skool[/h3]
Permutation Polynomials
SEE: GLSL Tricks
Weyl Generators
This family is based on the equidistribution theorem (properties of irrational numbers). Quality varies widely depending on formulation and precision. Like permutation polynomials they can be computed purely in floating point and are currently useful on the GPU.
The simplest version looks like:
u<sub>n</sub> = na mod 1 = (u<sub>n-1</sub>+a) mod 1
where:
‘a’ is a fixed irrational number mod 1.
‘n’ is the number in the sequence (0,1,2…)
Common variants include nested:
u<sub>n</sub> = n(na mod 1) mod 1
and shuffled nested (where ‘m’ is a fixed large integer):
v<sub>n</sub> = m(n(na mod 1) mod 1)+1/2
u<sub>n</sub> = (v<sub>n</sub>(v<sub>n</sub> a mod 1)) mod 1
vec3 weyli(vec3 s, vec3 a) { return fract(s+a); } // simple incremental
vec3 weyl(vec3 n, vec3 a) { return fract(n*a); } // simple explict
vec3 weyln(vec3 n, vec3 a) { return fract(n*fract(n*a)); } // nested
vec3 weylns(vec3 n, vec3 a) // shuffled nested
{
n = m*fract(n*fract(n*a))+0.5;
return fract(n*fract(n*a));
}
On the CPU side these are of limted usefulness, mainly integer versions can be used in a combined generator (below). Integer versions use a scaled irrational number rounded to the nearest odd value. This results in an abysmal quality generator with a period of 2n where ‘n’ is the number of bits in the integer.
/* 2<sup>31</sup>(3-sqrt(5)) = 0x61c88647 */
private static final int WEYL = 0x61c88647;
private int weyl;
private void updateState() { weyl += WEYL; }
Linear Congruent Generators
Linear congruent generators are among the oldest and best understood family of generators. Their quality can vary from poor to state-of-the-art. It’s only worth considering the poorest quality and fastest variants: standard integers with power-of-two modulus. The first variant, the MLCG, is a single multiply and has a period of 2n-1 (zero is an illegal state). This is worth mentioning for the 4K contest.
private int state;
private static final int K = 741103597;
public int nextInt() { state = K*state; return state; }
Of more general interest is the most common variant, which simply adds an odd constant at every step.
private int state;
public final int nextInt() {
state = 0xac549d55 * state + 0x61c88647;
return state;
}
LCGs have a very bad reputation. This is mainly due to some early C compilers using poorly chosen constants. Poorly chosen constants severly impact the quality of the generated sequence and this is true for all families of generators. A real issue with power-of-two LCGs is that the quality decreases based on bit position. The highest bit is the most random and each following bit is slightly less than the previous and the lowest bits aren’t random at all. This is less of a concern than one might imagine assuming your using a library with helper functions (nextInt(int n), nextFloat(), nextBoolean()
, etc) that take this fact into account. In this case the user might only need to consider this fact if using raw state: ‘nextInt
’ in the case of 32-bit and ‘nextLong
’ in the case of 64-bit generators. As an example calling nextInt()
and using the result as a set of boolean one should use the top bits.
[h3]New kids on the block[/h3]
A fair number of modernish generators are based on linear-shift-feedback register (LSFRs). Most popular generators which are considered state-of-the-art are based on (generalized) LSFR, such as Mersenne twister, WELL, et al. Generators like MT and WELL are insane overkill. For game related purposes interesting ones include: XorShift, XorWow and M-Sets.
[h3]Combined generators[/h3]
When one family of generators has a statistical weakness at a given set of tests it can be combined with another family of generators that performs well at those tests.
[h3]Summary[/h3]
Reference
Blah, blah