Why do people say there is modulo bias when using a random number generator?

So rand() is a pseudo-random number generator which chooses an integer between
0 and RAND_MAX (inclusive), where RAND_MAX is a constant defined in cstdlib
(see this article for a general overview on rand()).

Now what happens if you want to generate a random number between, say, 0 and 2?
For the sake of explanation, let's say RAND_MAX is 10 and I decide to generate
a random number between 0 and 2 by calling rand()%3. However, rand()%3 does not
produce the numbers between 0 and 2 with equal probability! When rand() returns
0, 3, 6, or 9, rand()%3 == 0. When rand() returns 1, 4, 7, or 10, rand()%3 == 1.
When rand() returns 2, 5, or 8, rand()%3 == 2. Assuming rand() returns each of
its 11 possible values with equal probability, the probability of getting a 0
is 4/11 and of getting a 1 is 4/11, but the probability of getting a 2 is only
3/11. So rand()%3 favors 0 and 1 over 2. When the range is small compared to
RAND_MAX the skew is slight, but for a larger range it can become significant,
always biasing toward the smaller numbers.
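
To make the counting above concrete, here is a small sketch that tallies how
many values in 0..rand_max land on each remainder. The function name
residue_counts and the parameter rand_max (a stand-in for RAND_MAX, so the toy
value 10 can be tried) are my own for illustration, not anything from cstdlib.

```cpp
#include <cassert>
#include <vector>

// Count how many values in [0, rand_max] map to each remainder under % n.
// rand_max is a hypothetical stand-in for RAND_MAX; this is meant for small
// toy values, since the loop visits every value once.
std::vector<int> residue_counts(int rand_max, int n) {
    assert(n > 0 && rand_max >= 0);
    std::vector<int> counts(n, 0);
    for (int value = 0; value <= rand_max; ++value) {
        ++counts[value % n];
    }
    return counts;
}
```

Calling residue_counts(10, 3) gives the counts 4, 4, 3, which are the
numerators of the 4/11, 4/11 and 3/11 probabilities worked out above.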

So when does rand()%n return each number from 0 to n-1 with equal probability?
When RAND_MAX%n == n - 1, that is, when the RAND_MAX + 1 possible return values
divide evenly into n groups. In that case, assuming rand() itself returns each
number between 0 and RAND_MAX with equal probability, every remainder modulo n
is equally likely.
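
The condition can be written as a one-line check. This is only a sketch: the
name modulo_is_unbiased and the rand_max parameter (standing in for RAND_MAX)
are hypothetical, chosen so different values can be tried.

```cpp
#include <cassert>

// True when the rand_max + 1 possible outputs split evenly into n remainders.
// Written as rand_max % n == n - 1 so that rand_max + 1 is never computed,
// which could overflow if rand_max were the largest int.
bool modulo_is_unbiased(int rand_max, int n) {
    assert(n > 0 && rand_max >= 0);
    return rand_max % n == n - 1;
}
```

With the toy value from above, modulo_is_unbiased(10, 3) is false, while
modulo_is_unbiased(11, 3) is true because 12 values split into three groups of
four.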

So how do we solve this problem? One way is to keep generating random numbers
until you get a number in your desired range:

    // Assumes <cstdlib> is included and 0 < n <= RAND_MAX + 1.
    // Keep drawing until the value falls in [0, n).
    int x;
    do {
        x = rand();
    } while (x >= n);
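
The loop above is unbiased, but it throws away every value that is n or larger,
so for small n it needs roughly (RAND_MAX + 1)/n calls to rand() on average. A
common refinement, sketched here as one possible approach, is to reject only
the values in the incomplete block at the top of the range and take the
remainder of the rest. The names acceptance_limit and unbiased_rand are my own,
and the sketch still assumes rand() itself is uniform.

```cpp
#include <cassert>
#include <cstdlib>

// Largest value we accept: [0, limit] holds a whole number of blocks of
// size n. Computed without forming rand_max + 1, which could overflow.
int acceptance_limit(int rand_max, int n) {
    assert(n > 0 && n - 1 <= rand_max);
    int excess = (rand_max % n + 1) % n;  // equals (rand_max + 1) % n
    return rand_max - excess;
}

// Uniform integer in [0, n), assuming std::rand() is uniform on [0, RAND_MAX].
int unbiased_rand(int n) {
    const int limit = acceptance_limit(RAND_MAX, n);
    int x;
    do {
        x = std::rand();      // redraw only if x is in the leftover top block
    } while (x > limit);
    return x % n;
}
```

With the toy RAND_MAX of 10 and n = 3, acceptance_limit(10, 3) is 8, so only 9
and 10 are rejected and each of 0, 1, 2 is hit by exactly three accepted
values. Since more than half of the range is always accepted, the loop runs
fewer than two times on average.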

Hope that helps everyone!