Why you should avoid modulo bias when generating random numbers

By Johan Attali

While reading a great article on NSHipster about randomness, I noticed that it suggests using arc4random_uniform() instead of taking the remainder of an arc4random() call.

The article simply points to another site to explain why we should use this function, but it doesn't clearly explain why taking the remainder with % is a bad idea. The problem it causes is called modulo bias.

It only says that using % works only if n evenly divides the number of values the generator can return (RAND_MAX + 1, since 0 counts too). Here is why:

Imagine that RAND_MAX is equal to 11, so that arc4random() returns one of 12 equally likely values (0 to 11), and you want a random number between 0 and 4. You would write:

int r = arc4random()%5;

And you would indeed get a random number between 0 and 4. But you would be more likely to get 0 or 1 than 2, 3 or 4. Here is why:

arc4random()      0  1  2  3  4  5  6  7  8  9 10 11
arc4random() % 5  0  1  2  3  4  0  1  2  3  4  0  1

So the probability of getting 0 is 3/12 = 1/4, while the probability of getting 2 is 2/12 = 1/6. Therefore you lose the uniform distribution.
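To double-check these counts without drawing the table by hand, here is a minimal C sketch that tallies every remainder for the toy generator. The helper name tally_remainders and its parameters are made up for this illustration; it does not call arc4random() at all, it just enumerates the 12 possible raw values.

```c
/* Count how often each remainder shows up when every one of the
   `range` equally likely raw values (0 .. range - 1) is reduced
   with % n. `counts` must have room for n entries.
   Hypothetical helper, written only for this example. */
void tally_remainders(unsigned range, unsigned n, unsigned *counts) {
    for (unsigned i = 0; i < n; i++) {
        counts[i] = 0;
    }
    for (unsigned x = 0; x < range; x++) {
        counts[x % n]++;
    }
}
```

Calling tally_remainders(12, 5, counts) with an array of 5 entries fills it with 3, 3, 2, 2, 2: the results 0 and 1 each come up three times out of 12, the others only twice.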

You'll also notice that if n divides the number of possible values (RAND_MAX + 1, here 12), the distribution is uniform again.

arc4random()       0  1  2  3  4  5  6  7  8  9 10 11
arc4random() % 12  0  1  2  3  4  5  6  7  8  9 10 11
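The two tables generalize. If the raw generator has `range` equally likely values, every result of % n appears range / n times (integer division), and the first range % n results appear one extra time. The function below is a small sketch of that rule; its name and parameters are assumptions made for this example.

```c
#include <stdint.h>

/* How many of the `range` raw values (0 .. range - 1) map to `result`
   under % n. Assumes n > 0 and result < n.
   Hypothetical helper, written only for this example. */
uint64_t hits_for_result(uint64_t range, uint64_t n, uint64_t result) {
    uint64_t base = range / n;   /* complete blocks of n values */
    uint64_t extra = range % n;  /* size of the last, partial block */
    return base + (result < extra ? 1 : 0);
}
```

With range = 12 and n = 5 this gives 3 hits for 0 and 2 hits for 2, as in the first table. With n = 12 every result gets exactly 1 hit, so there is no bias.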

The proper way of generating a random number between 0 and 4 on iOS is to use the function arc4random_uniform():

// random number from 0 to n - 1 (n itself is never returned)
uint32_t r = arc4random_uniform(n);
// random number from n to m - 1 (requires m > n)
uint32_t s = n + arc4random_uniform(m - n);
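For the curious, one common way to remove the bias is rejection sampling: throw away the few raw values that form the incomplete last block, and only then take the remainder. The sketch below illustrates that idea under stated assumptions. It is not presented as Apple's actual implementation, and the names rejection_threshold, uniform_below and toy_next are invented for this example. toy_next is a simple xorshift generator used only so the code runs anywhere; it is not cryptographically secure and merely stands in for arc4random().

```c
#include <stdint.h>

/* Hypothetical stand-in for a raw 32-bit generator such as arc4random(). */
typedef uint32_t (*raw32_fn)(void);

/* Toy xorshift32 generator, for demonstration only (not secure). */
static uint32_t toy_state = 2463534242u;
uint32_t toy_next(void) {
    toy_state ^= toy_state << 13;
    toy_state ^= toy_state >> 17;
    toy_state ^= toy_state << 5;
    return toy_state;
}

/* 2^32 % n, computed in 32-bit arithmetic as (2^32 - n) % n.
   This is how many raw values must be rejected. n must be nonzero. */
uint32_t rejection_threshold(uint32_t n) {
    return (0u - n) % n;
}

/* Value in 0 .. n - 1 without modulo bias. Returns 0 when n < 2. */
uint32_t uniform_below(raw32_fn next, uint32_t n) {
    if (n < 2) {
        return 0;
    }
    uint32_t min = rejection_threshold(n);
    uint32_t r;
    do {
        r = next();       /* draw again while r is one of the */
    } while (r < min);    /* `min` rejected values            */
    return r % n;         /* the accepted count is a multiple of n */
}
```

For n = 5 the threshold is 1, so only a single raw value out of 2^32 is ever rejected; the loop almost never runs more than once. In real code you would simply call arc4random_uniform(5).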

Also, if you look at the value of RAND_MAX on iOS, you will see:

// RAND_MAX = 2147483647 = 2^31 − 1
#define RAND_MAX  0x7fffffff

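RAND_MAX is the 8th Mersenne prime number, so nothing but 1 and RAND_MAX itself divides it. Strictly speaking, though, what n has to divide is the number of possible values: RAND_MAX + 1 = 2^31 for rand(), and 2^32 for arc4random(), which returns a full 32-bit value regardless of RAND_MAX. Only powers of two divide those evenly, so for any other n the remainder is biased.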