Hello, and welcome to the ninth episode of the Software Carpentry lecture on Python. In this episode, we'll take a closer look at some things you can do with functions that will allow you to do more with less code.
As we've seen in previous episodes, an integer is just a small block of bits in memory…
…that a variable can refer to.
And a string is just a sequence of bytes…
…that variables can also refer to.
Well, it turns out that a function is just another sequence of bytes—ones that happen to represent instructions…
…and yes, variables can refer to them too.
This insight—the fact that code is just another kind of data, and can be manipulated like integers or strings—is one of the most useful and powerful in all computing.
To understand why, let's have a closer look at what actually happens when we define a function.
These two lines of code tell Python that
threshold is a function that returns one over the sum of the values in its argument.
When we define it, Python translates the statements in the function into a blob of bytes, then creates a variable called
threshold and makes it point at that blob.
This is not really any different from assigning the string
'alan turing' to the variable
name: the only difference is what's in the memory the variable points to.
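As a rough sketch of what the slide shows, the two-line definition might look like this. The parameter name signal is an assumption made for illustration, since the transcript does not record it.

```python
# Hypothetical reconstruction: the parameter name `signal` is assumed.
def threshold(signal):
    return 1.0 / sum(signal)

# Assigning a string works the same way: a variable refers to a value in memory.
name = 'alan turing'

# Both names are ordinary variables; only the kind of value differs.
print(type(name))       # <class 'str'>
print(type(threshold))  # <class 'function'>
print(threshold([1.0, 1.0, 2.0]))  # 0.25
```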
Since threshold is just a reference to a value in memory, we should be able to assign that reference to another variable.
Here's our starting point once again.
And here's the assignment:
t = threshold. As you can see,
t now points to the same data as
threshold: it is an alias for the function.
To prove this is so, let's try calling
t. The result is exactly what we would get if we called
threshold with the same parameters, because
t and threshold are the same function.
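Here is a small sketch of that aliasing, assuming the same hypothetical definition of threshold as before (the parameter name signal is not from the lecture).

```python
# Hypothetical definition; the parameter name `signal` is assumed.
def threshold(signal):
    return 1.0 / sum(signal)

# Copy the reference, not the function: t is an alias for threshold.
t = threshold

print(t is threshold)        # True
print(t([1.0, 1.0, 2.0]))    # 0.25, the same as threshold([1.0, 1.0, 2.0])
```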
Here's another thought: if the function is "just" data, can we put a reference to it in a list?
Let's define two functions,
area and circumference, each of which takes a circle's radius as a parameter and returns the appropriate value.
Once those functions are defined, we can put them into a list like this. Of course, what we really mean is we're copying the references to the functions stored in the variables
area and circumference into the list, so that its first and second elements refer to the same blobs of instructions.
We can now loop through the functions in the list, calling each in turn.
Sure enough, the output is what we would get if we called
area and then
circumference with the parameter 1.0.
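A minimal sketch of the list of functions might look like this. The names funcs and f, and the use of math.pi, are assumptions for illustration.

```python
import math

def area(r):
    return math.pi * r * r

def circumference(r):
    return 2 * math.pi * r

# The list holds references to the functions, not copies of them.
funcs = [area, circumference]

# Call each function in turn with the same radius.
for f in funcs:
    print(f(1.0))
# prints 3.141592653589793 and then 6.283185307179586
```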
Let's go a little further. Instead of storing a reference to a function in a list, let's pass that reference into another function, just as we would pass a reference to an integer, a string, or a list.
Here's a function called
call_it that takes two parameters: a reference to some other function, and some other value. All
call_it does is call that other function with the given value as a parameter.
Let's test it with
area and 1.0: that's right.
And with circumference and 1.0: right again.
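One way call_it could be written is sketched below. The parameter names func and value are assumed, and area and circumference are repeated so the example stands on its own.

```python
import math

def area(r):
    return math.pi * r * r

def circumference(r):
    return 2 * math.pi * r

# call_it receives a reference to a function and simply calls it.
def call_it(func, value):
    return func(value)

print(call_it(area, 1.0))           # 3.141592653589793
print(call_it(circumference, 1.0))  # 6.283185307179586
```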
So far, so pointless, but now it's time for the payoff: functions of functions.
Here's a function called
do_all that, as its name suggests, applies some function—anything at all that takes one argument—to each value in a list, and returns a list of the results.
If we call
do_all with area and a list of numbers…
…we get what we would get if we called
area directly on each number in turn.
And if we define a function to "slim down" strings of text by throwing away their first and last characters…
…we can apply it to every string in a list, without having to copy the code that loops through the list, calls the function, and concatenates the results.
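Here is a sketch of do_all along those lines. The function name slim and the parameter names are assumptions, since the transcript only describes what the string function does.

```python
import math

def area(r):
    return math.pi * r * r

# Apply func to each value and collect the results in a new list.
def do_all(func, values):
    result = []
    for v in values:
        result.append(func(v))
    return result

# Hypothetical name: throw away the first and last characters of a string.
def slim(text):
    return text[1:-1]

print(do_all(area, [1.0, 2.0, 3.0]))
print(do_all(slim, ['abc', 'defg']))  # ['b', 'ef']
```

The loop lives in one place, so only the small single-purpose functions change from one use to the next.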
Functions that operate on other functions are called higher-order functions. They're common in mathematics: integration, for example, is a function that takes some other function and two endpoints as parameters. In programming, higher-order functions allow us to re-use control flow rather than rewriting it.
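To make the integration analogy concrete, here is a rough numerical sketch using the midpoint rule. The names integrate, square, and steps are all hypothetical; this is just one simple way to approximate an integral, not something taken from the lecture.

```python
# Approximate the integral of func from lo to hi with the midpoint rule.
def integrate(func, lo, hi, steps=1000):
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        midpoint = lo + (i + 0.5) * width
        total += func(midpoint) * width
    return total

def square(x):
    return x * x

# The exact answer is 1/3; the approximation should be very close to it.
print(integrate(square, 0.0, 1.0))
```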
Here's another example.
combine_values takes a function and a list of values as parameters, and "adds up" the values in the list using the function provided, returning a single value as its result.
To show how this works, let's define
add and mul to add and multiply values.
If we combine 1, 3, and 5 with
add, we get their sum, 9.
If we combine the same values with
mul, we get their product, 15. This same higher-order function
combine_values could concatenate lists of strings, too, or multiply several matrices together, or whatever else we wanted, without us writing the loop logic ever again.
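A possible version of combine_values, add, and mul is sketched below. The parameter names and the choice to start from the first value are assumptions about how the slide's code works.

```python
# Combine the values pairwise, left to right, using func.
def combine_values(func, values):
    current = values[0]
    for i in range(1, len(values)):
        current = func(current, values[i])
    return current

def add(x, y):
    return x + y

def mul(x, y):
    return x * y

print(combine_values(add, [1, 3, 5]))        # 9
print(combine_values(mul, [1, 3, 5]))        # 15
print(combine_values(add, ['a', 'b', 'c']))  # abc
```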
Higher-order functions are a good thing because they let us do more with less code. If we don't use higher-order functions…
…then we have to write one function for each combination of data structure and operation, i.e., one function to add numbers, another to concatenate strings, a third to sum matrices…
…and so on.
With higher-order functions, on the other hand…
…we only write one function for each basic operation, and one function for each kind of data structure…
…and since A plus B is usually a lot smaller than A times B, this saves us coding, testing, and debugging.
Of course, we have to know something about the function our higher-order function is operating on…
…like how many arguments it takes.
But in Python and many other languages, we can even get around that.
Here's a function called
add_all that sums up values using plus.
Notice the '*' in front of the parameter
args. This tells Python to take all of the parameters passed into the function when it's called and put them together in a special type of list called a tuple, which we'll explore in more detail in a couple of lectures.
If we call
add_all with no arguments, the tuple that's assigned to
args has no elements, so
add_all returns 0.
If we call
add_all with the integers 1, 2, and 3, though,
args has three elements, which
add_all sums up to get 6.
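As a sketch, add_all might be written like this. The local variable total is an assumed name.

```python
# The '*' collects every positional argument into the tuple args.
def add_all(*args):
    total = 0
    for a in args:
        total += a
    return total

print(add_all())         # 0, because args is the empty tuple
print(add_all(1, 2, 3))  # 6
```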
We can use this "catch-all" parameter with regular parameters, as long as the catch-all comes last.
Here, for example, is another version of
combine_values. The only difference between it and the previous version is the '*' in front of its second parameter.
This small change means that we don't have to put the values we want to combine into a list before calling the function: the first actual parameter is assigned to
func as before, and everything else goes into the tuple named by the starred parameter.
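A sketch of this starred version follows. The parameter name values is an assumption, and add and mul are repeated so the example runs on its own.

```python
# func is a regular parameter; every remaining argument lands in the tuple values.
def combine_values(func, *values):
    current = values[0]
    for i in range(1, len(values)):
        current = func(current, values[i])
    return current

def add(x, y):
    return x + y

def mul(x, y):
    return x * y

# No list brackets needed at the call site any more.
print(combine_values(add, 1, 3, 5))  # 9
print(combine_values(mul, 1, 3, 5))  # 15
```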
As an aside before we finish this episode, what do you think
combine_values will do if we only provide a function, and no values for that function to operate on?
More importantly, what do you think it should do?
Several higher-order functions are actually built in to Python. One is
filter, which constructs a new list containing all the values in an original list for which some function is true.
Another is map, which applies a function to every element of a list, returning a list of results…
…and then there's
reduce, which combines values using a binary function, returning a single value as a result.
For example, if
positive is True when its argument is greater than or equal to 0, then
filter with positive and a list of numbers returns a list of non-negative numbers.
Similarly, if negate changes the sign of its argument, then
map with negate and a list of numbers returns a list of negated values…
…and of course, using
reduce with add returns the sum of the values in the list.
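Here is how those three calls might look in Python 3. Two details differ from older versions of Python: filter and map return lazy iterators, so the sketch wraps them in list, and reduce has to be imported from functools. The list numbers is made-up sample data.

```python
from functools import reduce

def positive(x):
    return x >= 0

def negate(x):
    return -x

def add(x, y):
    return x + y

numbers = [-3, -2, 0, 1, 2]  # sample data, chosen for illustration

print(list(filter(positive, numbers)))  # [0, 1, 2]
print(list(map(negate, numbers)))       # [3, 2, 0, -1, -2]
print(reduce(add, numbers))             # -2
```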
It usually takes a while to get used to working with higher-order functions, but while the ideas are digesting, let's step back and ask, "What is programming anyway?"
Novices usually think that it means writing instructions for a computer.
But more experienced programmers think of it as creating and combining abstractions.
When we're programming, our goal is to spot a pattern, like "combine all elements of a list using a binary function".
And then write it down once, as clearly as possible…
…so that we can build more patterns on top of it.
Be cautious, though: the limits of human short-term memory still apply.
If you pile too many abstractions on top of one another, it can be difficult to figure out what the end result actually does.