Weighted random choice in PHP

PHP's native random_int() function can be used to easily create a weighted random choice between multiple, otherwise equal options.

Making a random choice with PHP is relatively straightforward. To pick evenly between several items, assign each item a number. Then use random_int() to select a random number, and you’ve selected the item assigned to it.
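As a minimal sketch of that even pick, assuming a hypothetical $items list (the values are only illustrative), the array indexes can serve as the assigned numbers:

```php
<?php
// Hypothetical list of equally likely options.
$items = ['red', 'green', 'blue'];

// Array indexes already number the items 0..count-1,
// so pick one index uniformly at random.
$index = random_int(0, count($items) - 1);

echo $items[$index] . PHP_EOL;
```

Both bounds of random_int() are inclusive, which is why the upper bound is count($items) - 1 rather than count($items).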

Making a weighted random choice is a bit more complicated. The output of random_int() is uniform: over enough iterations, it will select every possibility with roughly the same frequency. If we want one choice to be more likely than the others, we have more work to do.

Raffles as a proxy

Years ago, my high school offered a 50-50 raffle during football games. Anyone could buy a ticket for $1 and, at the end of the third quarter, they’d draw the winning ticket out of a bucket. The football program kept half, and the winner got the other half.

Since these were tickets you could buy, rather than a raffle based on merely being in attendance, you could purchase several attempts to win. The more tickets you bought, the more chances you had to win!

This made the raffle a weighted random choice between participants, and it’s exactly what we need to do in PHP.
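To make the analogy concrete, here is a rough sketch of the raffle itself. The names and ticket counts in $purchases are made up for illustration. Every ticket sold goes into a bucket, and a uniform draw from the bucket favors whoever bought the most tickets:

```php
<?php
// Hypothetical ticket purchases: name => number of tickets bought.
$purchases = ['Alice' => 3, 'Bob' => 1, 'Carol' => 6];

// Fill the bucket with one entry per ticket sold.
$bucket = [];
foreach ($purchases as $name => $tickets) {
  for ($i = 0; $i < $tickets; $i++) {
    $bucket[] = $name;
  }
}

// Draw one ticket uniformly; more tickets means more chances to win.
$winner = $bucket[random_int(0, count($bucket) - 1)];

echo $winner . PHP_EOL;
```

This literal version stores one array entry per ticket, so its memory use grows with the total number of tickets sold.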

Changing the model

Rather than assigning each option a number, we assign each option a weight. Assume in this model we’re trying to randomly select a piece of fruit for breakfast. Due to a mistake with our grocery order, we have far too many bananas, but only a couple of pears.

This means we want to increase the chance of selecting a banana while reducing the chance of selecting a pear. We have plenty of apples and oranges, though, so they’ll have an equal chance of being selected.

In code, we’ll model this as an array where each choice maps to its weight:

$fruits = [
  'banana' => 10,
  'apple'  => 5,
  'orange' => 5,
  'pear'   => 1,
];

To flesh this out entirely, we’re going to select a random integer between 1 and the sum of all of our weights – in this case 21. If the result is between 1 and 10, we choose a banana. If it’s exactly 21, we choose a pear.
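To see which integers belong to which fruit, we can walk the weights and record the slice each one covers. This is only an illustrative sketch; the $ranges, $low, and $high names are introduced here and are not part of the selection code.

```php
<?php
$fruits = [
  'banana' => 10,
  'apple'  => 5,
  'orange' => 5,
  'pear'   => 1,
];

// Walk the weights and record the inclusive [low, high] slice for each fruit.
$ranges = [];
$high = 0;
foreach ($fruits as $fruit => $weight) {
  $low = $high + 1;
  $high += $weight;
  $ranges[$fruit] = [$low, $high];
}

foreach ($ranges as $fruit => [$low, $high]) {
  printf("%s: %d-%d\n", $fruit, $low, $high);
}
// banana: 1-10
// apple: 11-15
// orange: 16-20
// pear: 21-21
```

Each fruit owns as many integers as its weight, so a uniform draw from 1 to 21 lands on a banana 10 times out of 21 and on a pear once out of 21.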

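// The total is the same for every draw, so compute it once.
$total_weight = array_sum($fruits);

// Make 1,000 draws.
foreach (range(1, 1000) as $i) {
  $fruit_chosen = 'bread'; // placeholder, overwritten by the loop below
  $selection = random_int(1, $total_weight);

  // Add up the weights until the running total reaches the selection.
  $count = 0;
  foreach ($fruits as $fruit => $weight) {
    $fruit_chosen = $fruit;
    $count += $weight;
    if ($count >= $selection) {
      break;
    }
  }

  echo $fruit_chosen . PHP_EOL;
}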

This works because random_int() makes each integer result equally likely, but our weights give us 10 chances out of 21 to pick a banana versus a single chance to choose a pear.
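If we need this in more than one place, the same logic can be wrapped in a function. The sketch below uses a hypothetical weighted_choice() helper, which is not a PHP built-in, and assumes every weight is a non-negative integer:

```php
<?php
// Hypothetical helper (not a built-in): returns one key of $weights,
// chosen with probability proportional to its weight.
// Assumes every weight is a non-negative integer.
function weighted_choice(array $weights): string
{
  $total = array_sum($weights);
  if ($total < 1) {
    throw new InvalidArgumentException('Weights must sum to at least 1.');
  }

  $selection = random_int(1, $total);

  // Add up the weights until the running total reaches the selection.
  $running = 0;
  foreach ($weights as $option => $weight) {
    $running += $weight;
    if ($running >= $selection) {
      return (string) $option;
    }
  }

  // Not reachable under the assumption above.
  throw new LogicException('No option selected.');
}

echo weighted_choice(['banana' => 10, 'apple' => 5, 'orange' => 5, 'pear' => 1]) . PHP_EOL;
```

An option with a weight of 0 is never returned, because it adds nothing to the running total.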

If we plot the frequency with which each option is chosen as a pie chart, we can easily see that:

  • We select bananas twice as often as either apples or oranges
  • We choose apples and oranges with equal frequency
  • We choose bananas ten times as frequently as pears

Figure: Frequency of fruit selection with a weighted random choice.
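To check those ratios without a chart, we can tally a large number of draws and compare each fruit's observed share with the share its weight predicts. This is a rough sketch; the sample size in $draws is an arbitrary choice, and the exact percentages will differ on every run.

```php
<?php
$fruits = ['banana' => 10, 'apple' => 5, 'orange' => 5, 'pear' => 1];
$total_weight = array_sum($fruits);
$draws = 21000; // arbitrary sample size chosen for this sketch

// Start every fruit at zero so unselected fruits still appear in the tally.
$tally = array_fill_keys(array_keys($fruits), 0);

for ($i = 0; $i < $draws; $i++) {
  $selection = random_int(1, $total_weight);
  $count = 0;
  foreach ($fruits as $fruit => $weight) {
    $count += $weight;
    if ($count >= $selection) {
      $tally[$fruit]++;
      break;
    }
  }
}

// Print each fruit's observed share next to the share its weight predicts.
foreach ($tally as $fruit => $hits) {
  printf(
    "%-6s observed %5.1f%%  expected %5.1f%%\n",
    $fruit,
    100 * $hits / $draws,
    100 * $fruits[$fruit] / $total_weight
  );
}
```

With enough draws, the observed column should settle near 47.6% for bananas, 23.8% each for apples and oranges, and 4.8% for pears.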

PHP is a powerful language. The built-in random functions are remarkably useful for making uniform, unpredictable choices. Knowing how to build on these functions to make weighted but still unpredictable choices is critical to using the language well.
