Good morning, and welcome to the next episode of the Software Carpentry lecture on program design using invasion percolation as an example. In this episode, we're supposed to talk about testing…
…but there's something else that we have to do first.
In our previous episode, we found and fixed one bug in our program…
…but how many others haven't we found?
More generally, how do we validate and verify a program like this? Those two terms sound similar, but mean different things.
Verification means, is our program free of bugs? I.e., did we build the thing right?
Validation means, are we implementing the right model? I.e., did we build the right thing?
The second question is one for scientists to answer…
…so we'll concentrate in this lecture on testing our program.
Except, as we said earlier, there's something we have to do first. We're actually going to look at how we make our program more testable. Testing anything that involves randomness is difficult, so in order to convince ourselves that our program is working, we need to come up with examples that aren't random.
Here's one: this grid has the value 2 everywhere…
…except in three cells that we have filled with 1's.
If our program is working correctly, it should fill exactly those three cells and nothing else.
If it doesn't, it should be pretty easy for us to figure out what has gone wrong.
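A grid like this is easy to build by hand. Here is a minimal sketch, assuming the grid is stored as a list of lists (the function name and the representation are assumptions, not part of the lecture's code):

```python
def create_test_grid():
    """A 5-by-5 grid of 2's, except for three 1's running from the
    center cell straight to one edge."""
    grid = [[2 for _ in range(5)] for _ in range(5)]
    for x in (0, 1, 2):
        grid[x][2] = 1  # a path of low values from the center (2,2) to the edge (0,2)
    return grid
```

Because the path of 1's reaches the boundary, a correct fill should stop after exactly those three cells.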
Here's the overall structure of our program as it currently stands.
fill_grid is the one we want to test…
…so let's reorganize our code to make it easier to create specific grids.
Grids are created by the function
create_random_grid, which takes the grid size and random value range as arguments.
Let's split that into two pieces.
The first creates an N×N grid containing the value 0.
The second overwrites those values with random values in the range 1 to Z.
We can then call something else to fill the grid with non-random values when we're testing. This change is pretty simple, and is left as an exercise for the viewer.
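One way that split might look, as a sketch (the function names here are assumptions; choose whatever names fit your program):

```python
from random import randint

def create_grid(n):
    """Create an n-by-n grid containing the value 0 everywhere."""
    return [[0 for _ in range(n)] for _ in range(n)]

def fill_random_grid(grid, z):
    """Overwrite every cell with a random integer in the range 1 to z."""
    for row in grid:
        for i in range(len(row)):
            row[i] = randint(1, z)
```

Calling the two in sequence reproduces the old create_random_grid; calling create_grid alone gives us a blank grid to fill with non-random test values.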
Here's another part of the old program that we need to change. It takes command-line arguments and converts them into integers in order to determine the grid size, the range of random values, and the random number seed.
Our new structure is going to use a function called
parse_arguments to do the same job.
We're also going to introduce a new argument in the first position called scenario. It doesn't need to be converted to an integer: it's just a string value specifying what we want to do. If the user gives us the word "random", we'll do exactly what we've been doing all along.
If the user gives us anything else, for the moment we will fail, but later on, we'll use
scenario to determine which of our test cases we want to run.
We're not going to need random numbers when we fill the grid manually for testing.
We're also not going to need the value range…
…or the grid size.
Let's move argument handling and the seeding of the random number generator into the
if branch that handles the random scenario.
Once we make this change…
…we determine the scenario by looking at the first command-line argument…
…and then if that value is the word "random"…
…we look at the remaining arguments to determine the grid size, the value range, and the random seed.
If the first argument isn't the word "random", then we fail.
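Put together, the argument handling just described might look like this sketch (the exact signatures and the fail helper's wording are assumptions):

```python
import sys

def fail(message):
    """Print an error message and halt the program."""
    print(message, file=sys.stderr)
    sys.exit(1)

def parse_arguments(args):
    """The first argument names the scenario; the rest depend on it."""
    scenario = args[0]
    if scenario == "random":
        grid_size = int(args[1])     # N, for an N-by-N grid
        value_range = int(args[2])   # Z, for random values in 1..Z
        random_seed = int(args[3])   # seed for the random number generator
        return scenario, grid_size, value_range, random_seed
    else:
        fail("Unknown scenario: " + scenario)
```

Only the random scenario converts the remaining arguments to integers; any other scenario name currently fails.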
Here's a closer look at what's inside that first if branch.
We parse arguments…
…seed the random number generator…
…create a grid…
…fill it with random values…
…mark the center cell as filled…
…fill the rest of the grid…
…and then print out how many cells were filled.
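Those seven steps can be sketched as a driver function. The stand-in helpers below exist only so the sketch runs on its own; in the real program each is the full function, and every name here is an assumption:

```python
import random

# Stand-ins so the sketch is self-contained (names are assumptions).
def parse_arguments(args):
    return args[0], int(args[1]), int(args[2]), int(args[3])

def create_grid(n):
    return [[0] * n for _ in range(n)]

def fill_random_grid(grid, z):
    for row in grid:
        for i in range(len(row)):
            row[i] = random.randint(1, z)

def mark_filled(grid, x, y):
    grid[x][y] = -1  # using -1 as the "filled" marker is an assumption

def fill_grid(grid):
    return 0  # stand-in: the real version fills outward until the boundary

def do_random(args):
    scenario, size, value_range, seed = parse_arguments(args)  # parse arguments
    random.seed(seed)                              # seed the random number generator
    grid = create_grid(size)                       # create a grid
    fill_random_grid(grid, value_range)            # fill it with random values
    mark_filled(grid, size // 2, size // 2)        # mark the center cell as filled
    num_filled = fill_grid(grid) + 1               # fill the rest; +1 for the center
    print("cells filled:", num_filled)             # report how many cells were filled
    return num_filled
```

Notice the "+1" for the manually filled center cell: that is exactly the wart we clean up next.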
The name of the function that fills the grid with random values is very similar to
fill_grid; it would be very easy for people to confuse the two if the code were being read aloud.
Let's rename the first one to
init_grid_random. The functions that initialize the grid for specific test cases can then be given matching init_grid_ names.
And while we're here, let's clean up something that we first pointed out a couple of episodes ago. We are manually filling the center cell, and then calling
fill_grid to fill the remainder. That means we have to add 1 to the result returned by
fill_grid. Since we do that in all of our scenarios…
…let's just move that code into
fill_grid, so the
fill_grid function now marks the center cell, fills until it reaches the boundary, and returns the total number of cells filled. This is less of a burden on the people using our code, because they don't have to remember to mark the center cell themselves, or to add 1 to the result of fill_grid.
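The revised fill_grid might be sketched like this. It is a deliberately simple (and inefficient) version that rescans the whole grid on every step; the FILLED marker value and the random tie-breaking rule are assumptions:

```python
import random

FILLED = -1  # marker for filled cells (an assumption; any impossible value works)

def fill_grid(grid):
    """Mark the center cell as filled, then repeatedly fill the lowest-valued
    unfilled neighbor of the filled region until a boundary cell is filled.
    Return the total number of cells filled, center included."""
    n = len(grid)
    center = n // 2
    grid[center][center] = FILLED
    count = 1
    while True:
        lowest, candidates = None, []
        for x in range(n):
            for y in range(n):
                if grid[x][y] == FILLED:
                    continue
                # Only cells adjacent to the filled region are candidates.
                adjacent = any(
                    0 <= x + dx < n and 0 <= y + dy < n
                    and grid[x + dx][y + dy] == FILLED
                    for dx, dy in ((-1, 0), (1, 0), (0, -1), (0, 1))
                )
                if not adjacent:
                    continue
                if lowest is None or grid[x][y] < lowest:
                    lowest, candidates = grid[x][y], [(x, y)]
                elif grid[x][y] == lowest:
                    candidates.append((x, y))
        x, y = random.choice(candidates)  # ties are broken at random
        grid[x][y] = FILLED
        count += 1
        if x in (0, n - 1) or y in (0, n - 1):
            return count
```

Running it on the hand-built test grid from earlier (2's everywhere, three 1's from the center to the edge) should fill exactly three cells, with no tie-breaking involved.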
Here's the structure of our revised program. We have the documentation string—which, by the way, we've updated to remind people that the first argument is the name of the scenario. Our
fail function hasn't changed.
We've split grid creation into two functions.
The fill_grid function now fills the middle cell and returns the count of all filled cells.
And we have a function to parse command-line arguments.
This argument-parsing function is actually specific to the random case.
We should probably rename it…
…to make that clear.
Now let's step back. We were supposed to be testing our program, but in order to make it more testable…
…we had to reorganize it first. The jargon word for this is refactoring.
This means "changing a program's structure without modifying its behavior or functionality in order to improve its quality."
Entire books have been written about how to refactor programs systematically.
The first, and most influential, was by Martin Fowler.
It's mostly a catalog of refactoring techniques for object-oriented programs.
Another is by Michael Feathers. He discusses how to refactor legacy programs—i.e., programs that you've inherited that may be very tangled, poorly documented, or (in most cases) both—in order to make them easier to test.
The examples in this book are drawn from many different languages.
And now that we've done this refactoring, we can start to test our program.