Welcome to the fourth episode of the Software Carpentry lecture on program design using invasion percolation as an example. In this episode, we will have a look at how we can fill our grid with random numbers.
If you recall, we're supposed to create a 2D grid of random values.
We're going to choose those random values uniformly from some range 1 to Z.
We should check the science on this, as there was nothing in our original specification that said the values should be uniformly distributed, or even that they should be integers, but these seem like safe simplifying assumptions for now.
This is the code we've created so far to build a list-of-lists structure to represent the grid and fill each grid cell with 1.
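The slide showing that code is not reproduced in this transcript. A minimal sketch of what it might look like, assuming a grid size called N (a name and value chosen here purely for illustration), is:

```python
# Assumed grid size for illustration; the lecture's actual value is not shown here.
N = 5

# Build an N x N list of lists with every cell set to 1.
grid = []
for x in range(N):
    grid.append([])          # start a new, empty row (or column)
    for y in range(N):
        grid[-1].append(1)   # add one cell to the most recent list
```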
This is the code that creates the same list of lists, but fills each cell value with a random number.
The changes are pretty small: we're going to import a couple of functions from a library, check the upper bound on our range, initialize the random number generator, and then call randint to generate a random number.
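Since the slide itself is missing here, the following is only a sketch of how those four changes could fit together. The names N, Z, and S and their values are assumptions made for this example, not necessarily what the lecture used:

```python
from random import seed, randint

# N, Z, and S are assumed names and values for illustration only.
N = 5        # grid size
Z = 10       # upper bound of the random range 1..Z
S = 12345    # seed for the random number generator

# Check the upper bound on our range before using it.
assert Z > 0, "Range must be positive"

# Initialize the random number generator.
seed(S)

# Same list-of-lists structure as before, but each cell gets a random value.
grid = []
for x in range(N):
    grid.append([])
    for y in range(N):
        grid[-1].append(randint(1, Z))   # integer in 1..Z, both ends included
```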
It will be simpler to understand these changes, though, if we take a look at a small program that does nothing but generate a few random numbers.
The first step is to import functions from the standard Python random number library called (unsurprisingly) random.
We then initialize the sequence of "random" numbers we're going to generate—you'll see in a moment why there are quotes around the word "random".
We can then call randint to produce the next random number in the sequence as many times as we want.
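A small program along those lines might look like the sketch below. The seed value, the range 1 to 10, and the count of five numbers are arbitrary choices made for this example:

```python
from random import seed, randint

# Initialize the sequence; 4713983 is an arbitrary seed chosen for illustration.
seed(4713983)

# Ask for the next number in the sequence five times.
values = []
for i in range(5):
    values.append(randint(1, 10))   # integer between 1 and 10, inclusive

print(values)
```

Running this twice with the same seed should print the same five numbers, which is a first hint at why "random" is in quotes.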
Pseudo-random number generators, like the ones found in Python's random library, have some limitations, and it's important that you understand them before you use them in your programs.
In order to understand them, let's take a look at this very, very simple "random" number generator. It depends on two values.
The base, which is a prime number, determines how many integers you'll get before the sequence starts to repeat itself. After all, computers can only represent a finite range of numbers, so sooner or later, any supposedly random sequence will start to repeat.
Once it does, values will appear in exactly the same order they did before.
The seed controls where the sequence starts. With a seed of 4, the sequence starts at the value 0.
If we change the seed to 9, all it does is shift the sequence over: we get the same numbers in the same order, but starting from a different place.
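The generator on the slide is not included in this transcript, so here is a sketch reconstructed from the description: each new value is 3 times the previous value plus 5, taken mod the base of 17. The function name simple_sequence and the count parameter are assumptions added for this example:

```python
def simple_sequence(seed, base=17, count=20):
    """Return `count` values from the toy generator value = (3*value + 5) % base."""
    values = []
    value = seed
    for _ in range(count):
        value = (3 * value + 5) % base
        values.append(value)
    return values

print(simple_sequence(4))
# [0, 5, 3, 14, 13, 10, 1, 8, 12, 7, 9, 15, 16, 2, 11, 4, 0, 5, 3, 14]

print(simple_sequence(9))
# [15, 16, 2, 11, 4, 0, 5, 3, 14, 13, 10, 1, 8, 12, 7, 9, 15, 16, 2, 11]
```

With a seed of 4 the values repeat after 16 steps, and with a seed of 9 the same cycle simply begins at 15 instead of 0.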
We'll use this fact later on when it comes time to test our invasion percolation program.
This code is actually a really lousy random number generator.
For example, did you notice that the number 6 never appeared anywhere in the sequence?
That would probably distort our results: it would introduce a bias into our statistics that might be very hard to detect.
What happens when 6 does appear? Well, as you can see, 3 times 6 plus 5 mod 17 is 6 again, and so our sequence gets stuck.
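To see the generator get stuck, we can start it at 6 and apply the same update rule a few times. This is a sketch using the multiplier 3, increment 5, and base 17 from the arithmetic above; the variable name stuck is just for this example:

```python
base = 17
value = 6      # start the toy generator at the problem value

stuck = []
for i in range(5):
    value = (3 * value + 5) % base   # (3*6 + 5) % 17 == 23 % 17 == 6
    stuck.append(value)

print(stuck)
# [6, 6, 6, 6, 6]
```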
How can we prove that this won't ever happen for an arbitrary seed in a random number generator?
And how can we prove that something subtler won't go wrong?
In fact, computers can't generate real random numbers.
But if you're careful, they can generate numbers with many of the same statistical properties as the real thing.
This is very hard to get right…
…so never try to build your own random number generator.
Instead, you should always use a function from a good, well-tested library…
…like Python's.
Thank you.