Find us on GitHub

Teaching basic lab skills
for research computing

Matrix Programming: Linear Algebra

Matrix Programming/Linear Algebra at YouTube

Hello, and welcome to another episode of the Software Carpentry lecture on matrix programming. In this episode, we'll have a look at how to do linear algebra.

As we saw in previous episodes, NumPy arrays make it easy to do things with rectangular blocks of data. But that doesn't mean they behave exactly like the matrices that mathematicians use. For example, let's create an array, and then multiply it by itself. As you can see, NumPy has not done what a mathematician would do. Instead, it has done the operation elementwise.

On the bright side, elementwise operation means that array addition works as you would expect. And since there's only one sensible way to interpret an expression like "array plus one", NumPy does the sensible thing there too.

Like other array-based libraries or languages, NumPy provides many useful tools for common operations. For example, if we have the array shown here, then we can sum its values with a single function call. We can also calculate the partial sum along each axis… …just by passing an extra argument into the sum function.

As an exercise, what do you think sum(a, 2) will do for a two-dimensional array?

Here's a long example of what we can do with array operations. Suppose we have been observing the progress of a disease in some test subjects. Each row of our array corresponds to one patient. And each column is an hourly count of responsive T cells. This means that the zeroth column of our data is the initial T cell count for all patients… While the zeroth row is all hourly samples for patient 0.

As an exercise, check that you understand the slicing notation used here, and can explain why the slices are one-dimensional rather than two-dimensional.

The expression 'mean(data)' gives us the average T cell count for all patients at all times. It's nice to know we can do this, but it's not a particularly meaningful statistic. The mean of the data along axis 0, on the other hand, gives us the average across all patients for each hour. This is much more useful: it is the "normal" progress of the disease. And the mean along axis 1 gives us the average T cell count per patient across all times, which could be useful if we need to normalize the data.

It might be even more interesting to look at what happened to people who started with no responsive T cell count at all. The first step is to once again select the first column of data, i.e., the initial T cell counts for each patient. If we compare these to zero, we get a Boolean array with True for each row of the array that meets our criteria. If we use this to index the original array, we get the two rows for which the count at time zero is zero.

Now let's find the mean T cell count over time for those people. Once again, we start by selecting column 0…

…and testing it to create a Boolean mask.

Using that mask as a subscript gives us the rows that have zero in the first place.

We can now use the 'mean' function along axis zero—i.e., across patients.

Which gives us the average behavior of patients who started with no responsive T cells at all.

This example highlights the key to good matrix programming: write high-level statements without loops, and let the computer worry about how to do the operations element by element. This is just as true for MATLAB or R as it is for Python:

All right, so what about "real" matrix multiplication? NumPy provides a function called 'dot' that does this, and other useful things. 'dot' of two matrices is their mathematical matrix product. Dot of two vectors is their dot product, i.e., the sum of the products of their elements.

Just as in mathematics, though, dot only works for things with compatible shapes. For example, you cannot multiply a 2×3 matrix and another 2×3 matrix: NumPy complains that the objects aren't aligned. And please note that NumPy does not distinguish between row and column vectors: if 'v' is a 1×2 vector, NumPy will happily calculate either of its products with a 2×2 matrix.

If you really prefer multiplication with a '*' to calling the function 'dot', you can store your data in a class called 'matrix' instead of in an 'array'. 'matrix' has all of the features of 'array', but its '*' operator does what a mathematician would expect. You can convert one form to the other using the 'matrix' and 'array' functions.

So which should you use? If your program is doing linear algebra, 'matrix' will probably be more convenient… …not least because it treats vectors as Nx1 matrices. Otherwise, you should use 'array'. In particular, if you are representing a multi-dimensional grid of some sort, rather than an abstract matrix of numbers, 'array' is better choice.

Finally, always remember that you're not the first person to program with matrices. Always take a look at the online documentation for NumPy before writing any functions of your own. The library includes routines to conjugate, convolve, and correlate matrices… …to extract diagonals… …calculate FFTs, gradients, histograms, and least squares… …or net present value if you're doing financial mathematics. It can find roots… …solve sets of linear equations… …and do singular value decomposition. These functions are all faster than anything you could easily write… …and what's more, someone else has tested and debugged them.

In our next episode, we'll have a look at a longer example of how you can use NumPy to solve real-world problems.