Hello, and welcome to the second episode of the Software Carpentry lecture on testing. In this episode, we're going to show you how to handle errors in programs using exceptions. Strictly speaking, this isn't part of testing, but we have to put it somewhere, and since you're going to want to test how your programs behave when things don't go as planned, this seems like as good a place as any.
It's a sad fact, but things sometimes go wrong in programs.
Some of these errors have external causes…
…like missing or badly-formatted files.
Others are internal…
…like bugs in code.
Either way, there's no need for panic.
It's actually pretty easy to handle errors in sensible ways. First, though, let's have a look at how programmers used to do error handling.
Back in the Dark Ages, programmers would have functions return some sort of status to indicate whether they had run correctly or not.
This led to code like this.
The stuff in green is what we really want.
The stuff in red is there to check that files were opened and read properly, and to report errors and exit if not.
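The slide itself isn't reproduced in this transcript, so here is a rough Python sketch of the status-code style being described. The names OK, FAILED, read_text, and load_inputs are hypothetical, and the sketch returns a status where the original reports the error and exits, so that it is easy to try out.

```python
import os

OK, FAILED = 0, 1  # hypothetical status codes


def read_text(path):
    """Stand-in for an old-style library call: returns (status, text)."""
    if not os.path.isfile(path):
        return FAILED, None
    with open(path) as reader:
        return OK, reader.read()


def load_inputs(params_path, grid_path):
    status, params = read_text(params_path)   # what we really want
    if status != OK:                           # error checking
        print("error reading", params_path)
        return FAILED
    status, grid = read_text(grid_path)        # what we really want
    if status != OK:                           # error checking
        print("error reading", grid_path)
        return FAILED
    return OK
```

Only two of the lines in load_inputs do real work; the rest is bookkeeping wrapped around them.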
A lot of C and Fortran code is still written this way, but this coding style makes it hard to see the forest for the trees.
When we're reading a program, we want to understand what's supposed to happen when everything works…
…and only then think about what might happen if something goes wrong. When the two are interleaved, both are harder to understand.
The net result is that most programmers don't bother to check the status codes their functions return.
Which means that when errors do occur, they're even harder to track down.
Luckily, there's a better way. Modern languages like Python allow us to use exceptions to handle errors.
More specifically, using exceptions allows us to separate the "normal" flow of control from the "exceptional" cases that arise when something goes wrong.
This makes both easier to understand.
Basically, what exceptions allow us to do is take the code we were just looking at…
…and put the "normal" parts in one place…
…and all the error-handling parts in another.
As a fringe benefit, this often allows us to eliminate redundancy in our error handling.
To join the two parts together, we use the keywords try and except. These work together like if and else: the statements under the try are what should happen if everything works, while the statements under except are what the program should do if something goes wrong.
You have actually seen exceptions before without knowing it.
For example, trying to open a nonexistent file triggers a type of exception called an IOError…
…while an out-of-bounds index to a list triggers an IndexError.
By default, when an exception occurs, Python prints it out and halts our program.
We can use try and except to deal with these errors ourselves if we don't want that to happen.
Here, for example, we put our attempt to open a nonexistent file inside a try, and in the except, we print a not-very-helpful error message.
Notice that the output is blue, signalling that it was printed normally, rather than red, which is shown for errors.
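A minimal sketch of what that slide probably looks like, wrapped in a hypothetical function called try_open so it can be reused. In Python 3, IOError is an alias for OSError, and a missing file raises its subclass FileNotFoundError, so this handler still catches it.

```python
def try_open(path):
    """Return an open file, or None if the file cannot be opened."""
    try:
        reader = open(path, "r")
    except IOError:
        print("Whoops!")      # the not-very-helpful message
        return None
    return reader
```

Calling try_open with the name of a file that doesn't exist prints the message as ordinary output and returns None instead of halting the program.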
When Python executes this code, it runs the statement inside the try. If that works, it skips over the except block without running it.
If an exception occurs inside the try block, though, Python compares the type of the exception to the type specified by the except. If they match, it executes the code in the except block.
Note, by the way, that IOError is Python's way of reporting several kinds of problems related to input and output: not just files that don't exist, but also things like not having permission to read files, and so on, so we can handle several types of error in one place.
We can put as many lines of code in a try block as we want, just as we can put many statements under an if.
We can also handle several different kinds of errors afterward.
For example, here's some code to calculate the entropy at each point in a grid.
Python tries to run the four statements inside the try as normal. If an error occurs in any of them, Python immediately bails out and tries to find an except whose type matches the type of the error that occurred.
If it's an IOError, Python jumps into the first error handler.
If it's an ArithmeticError, Python jumps into the second handler instead. It will only execute one of these, just as it will only execute one branch of a series of if/elif/else statements.
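The entropy code from the slide isn't included in the transcript, so the following is only a sketch of the same shape: four function calls inside one try, followed by two handlers. All of the helper names (read_numbers, normalize, entropy, save, process) and the one-number-per-line file format are assumptions made for illustration.

```python
import math


def read_numbers(path):
    """Hypothetical reader: one number per line."""
    with open(path) as reader:
        return [float(line) for line in reader]


def normalize(values):
    total = sum(values)
    return [v / total for v in values]   # ZeroDivisionError if total is 0


def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0)


def save(path, value):
    with open(path, "w") as writer:
        writer.write(str(value))


def process(grid_file, entropy_file):
    try:
        grid = read_numbers(grid_file)
        probs = normalize(grid)
        result = entropy(probs)
        save(entropy_file, result)
    except IOError:
        print("I/O error")               # first handler
        return "io"
    except ArithmeticError:
        print("arithmetic error")        # second handler
        return "arithmetic"
    return "ok"
```

ZeroDivisionError is a kind of ArithmeticError, so a grid whose values sum to zero lands in the second handler, while a missing input file lands in the first.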
This layout has made the code easier to read, but we've lost something important: the message printed out by the IOError branch doesn't tell us which file caused the problem. We can do better if we capture and hang on to the object that Python creates to record information about the error.
In Python 2.5 and earlier, we do this by putting a variable name after the name of the exception type, separating the two with a comma.
If something goes wrong in the try, Python will create an exception object, fill it with information, and assign it to the variable error. (There's nothing special about this variable name; we can use anything we want.)
Exactly what information is recorded depends on what kind of error occurred. Python's documentation describes the properties of each type of error in detail, but we can always just print the exception object.
Python 2.6 and higher allow us to make this a bit more readable using the keyword as. The old style still works in Python 2, but Python 3 accepts only the new syntax, and most new code is written that way.
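As a small sketch of the as form, here is a hypothetical function, first_line, that captures the exception object and prints it. The variable name error is arbitrary.

```python
def first_line(path):
    """Return the first line of a file, or None if it cannot be read."""
    try:
        with open(path, "r") as reader:
            return reader.readline()
    except IOError as error:             # capture the exception object
        print("could not read file:", error)
        return None
```

Printing the exception object shows whatever message Python stored in it, which for a missing file includes the file's name.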
Now let's go back and create better error messages.
Here's the modified code.
And here are the changes.
In the case of an I/O error, we print out the name of the file that caused the problem.
And in the case of an arithmetic error, printing out the message embedded in the exception object is what Python would have done anyway.
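The modified slide isn't in the transcript either, so this is a smaller stand-in that makes the same two changes. The function ratio_report and its two-numbers-in-a-file format are made up for illustration; the filename attribute is a real attribute of the exception raised when open fails.

```python
def ratio_report(path):
    """Read two numbers from a file and describe their ratio."""
    try:
        with open(path) as reader:
            top, bottom = [float(x) for x in reader.read().split()]
        return "ratio is {}".format(top / bottom)
    except IOError as error:
        # say which file caused the problem
        return "cannot read file: {}".format(error.filename)
    except ArithmeticError as error:
        # reuse the message embedded in the exception object
        return "arithmetic problem: {}".format(error)
```

With this version, a missing file produces a message that names the file, and a division by zero produces Python's own description of what went wrong.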
So much for how exceptions work: how should they be used?
Some programmers use try and except to give their programs default behaviors. For example, if this code can't read the grid file that the user has asked for, it creates a default grid instead.
Other programmers would explicitly test for the grid file, and use if and else for control flow.
It's mostly a matter of taste, but we prefer the code on the right. As a rule, exceptions should only be used to handle exceptional cases; if the program knows how to fall back to a default grid, that's not an unexpected event. Using if and else instead of try and except sends different signals to anyone reading our code, even if they do the same thing.
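To make the comparison concrete, here is a sketch of both styles side by side. The helpers read_grid and default_grid are hypothetical, as is the whitespace-separated integer file format.

```python
import os


def default_grid():
    return [[0, 0], [0, 0]]          # made-up default for illustration


def read_grid(path):
    with open(path) as reader:
        return [[int(x) for x in line.split()] for line in reader]


def load_with_try(path):
    # Style 1: treat the missing file as an exception.
    try:
        return read_grid(path)
    except IOError:
        return default_grid()


def load_with_if(path):
    # Style 2: test explicitly; the fallback is part of normal control flow.
    if os.path.isfile(path):
        return read_grid(path)
    else:
        return default_grid()
```

Both functions return the default grid when the file is missing; the second is the if and else style that signals the fallback is an expected case.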
Novices often ask another question about exception handling style as well…
…but before we address it, there's something in our example that you might not have noticed.
Exceptions can actually be thrown a long way: they don't have to be handled immediately.
Take another look at this code.
The four lines in the try block are all function calls.
They might catch and handle exceptions themselves, but if an exception occurs in one of them that isn't handled internally…
…Python looks in the calling code for a matching except. If it doesn't find one there, it looks in that function's caller, and so on. If we get all the way back to the main program without finding an exception handler, Python's default behavior is to print an error message like the ones you've been seeing all along.
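Here is a small sketch of an exception travelling up through two calls before it is caught. The function names parse_value, parse_row, and main are invented for this example.

```python
def parse_value(text):
    return float(text)               # raises ValueError on bad input; no handler here


def parse_row(line):
    # No handler here either, so the exception keeps travelling upward.
    return [parse_value(field) for field in line.split(",")]


def main(line):
    try:
        return parse_row(line)
    except ValueError as error:      # caught two levels above where it was raised
        print("bad input:", error)
        return None
```

Calling main("1,2") returns [1.0, 2.0], while main("1,x") prints a message and returns None, even though the error actually occurred inside parse_value.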
This rule is the origin of the saying, "Throw low, catch high."
There are many places in your program where an error might occur.
There are only a few, though, where errors can sensibly be handled.
For example, a linear algebra library doesn't know whether it's being called directly from the Python interpreter, or whether it's being used as a component in a larger program. In the latter case, the library doesn't know if the program that's calling it is being run from the command line or from a GUI.
The library therefore shouldn't try to handle or report errors itself, because it has no way of knowing what the right way to do this is. Instead, it should just raise an exception, and let its caller figure out how best to handle it.
Finally, you can raise exceptions yourself if you want to.
In fact, you should do this, since it's the standard way in Python to signal that something has gone wrong.
Here, for example, is a function that reads a grid and checks its consistency.
The raise statement creates a new exception with a meaningful error message. Since read_grid itself doesn't contain a try/except block, this exception will always be thrown up and out of the function, to be caught and handled by whoever is calling read_grid.
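The slide's version of read_grid isn't shown in the transcript, so this sketch guesses at its shape. Here it takes a list of text lines rather than a file, and "consistency" is assumed to mean that every row has the same length.

```python
def read_grid(lines):
    """Parse rows of integers and check that all rows are the same length."""
    grid = [[int(x) for x in line.split()] for line in lines]
    for row in grid:
        if len(row) != len(grid[0]):
            # Signal the problem; the caller decides what to do about it.
            raise ValueError("grid rows have inconsistent lengths")
    return grid
```

A caller that wants to recover would wrap the call in try and except ValueError; a caller that doesn't will see Python's usual error report.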
You can define new types of exceptions if you want to.
And in fact you should, so that errors in your code can be distinguished from errors in other people's code.
However, this involves classes and objects, so we'll cover it in the lecture on object-oriented programming.
Thank you.