Python: Slicing

Hello, and welcome to the twelfth episode of the Software Carpentry lecture on Python. This episode will show you how to take sections out of lists, strings, and tuples.

Lists, strings, and tuples are all sequences.

Which means they can all be indexed by integers in the range 0 to len minus 1.

But they can also be sliced using a range of indices.

To see how slicing works, let's assign the string 'uranium' to the variable element.

If we index the string with the expression 1:4, we get back the characters from index 1, up to but not including index 4, i.e., characters 1, 2, and 3.
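As a minimal sketch of what was just described, here is the string 'uranium' indexed and then sliced. The variable name `middle_chars` is only illustrative and does not come from the lecture.

```python
element = 'uranium'

# Single indices run from 0 to len(element) - 1.
first_char = element[0]
last_char = element[len(element) - 1]
print(first_char, last_char)  # u m

# A slice takes characters 1, 2, and 3: the upper bound 4 is excluded.
middle_chars = element[1:4]
print(middle_chars)  # ran
```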

If we don't specify the lower bound, it defaults to 0, which is the start of the string.

Similarly, if we don't specify the upper bound, it defaults to the end of the string.

And we can use negative indices as bounds too, which count positions backward from the end of the string. The slice expression -4: gives us characters -4, -3, -2, and -1.
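The default bounds and negative bounds can be tried out the same way. This is a small sketch using the same string; the variable names are made up for the example.

```python
element = 'uranium'

# Missing lower bound: the slice starts at index 0.
from_start = element[:4]
print(from_start)  # uran

# Missing upper bound: the slice runs to the end of the string.
to_end = element[4:]
print(to_end)  # ium

# Negative lower bound: the last four characters.
last_four = element[-4:]
print(last_four)  # nium
```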

Indices are interpreted the same way for slices as they are for single elements, but there is one important difference. When a single index is used to get one element from a string or list, Python always checks that it's in bounds, and gives an error if it isn't.

But Python just truncates out-of-bounds values when slicing.

Here's our string 'uranium' again.

If we try to get element 400, we get an error, because the string isn't that long.

If we take a slice from index 1 up to index 400, though, Python quietly clips the upper bound to the end of the string for us.
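Here is a short sketch of that difference. The try/except is only there so the example keeps running after the error; the names `caught` and `clipped` are illustrative.

```python
element = 'uranium'

# A single out-of-bounds index raises IndexError.
caught = False
try:
    element[400]
except IndexError:
    caught = True
print(caught)  # True

# An out-of-bounds slice bound is clipped instead of raising an error.
clipped = element[1:400]
print(clipped)  # ranium
```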

Some people find this useful…

…but others trip over the inconsistency from time to time.

This behavior is handy, though: no matter how long the string text is, the expression text[1:3] is always legal, but its value may be zero, one, or two characters long.

Have a look at these examples and make sure you understand why each one has the value it does.
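The examples from the slide are not part of this transcript. As a stand-in, the sketch below applies the slice 1:3 to a few hypothetical strings of different lengths, chosen here purely for illustration.

```python
# Hypothetical test strings, from empty up to six characters long.
samples = ['', 'a', 'ab', 'abc', 'abcdef']

# text[1:3] never raises an error; it just returns fewer characters
# when the string is too short.
results = [text[1:3] for text in samples]

for text, piece in zip(samples, results):
    print(repr(text), '->', repr(piece))
# '' -> ''
# 'a' -> ''
# 'ab' -> 'b'
# 'abc' -> 'bc'
# 'abcdef' -> 'bc'
```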

To be consistent, the expression text[1:1] is always the empty string.

Because going from location 1, up to but not including location 1, is an empty range.

Carrying on, if the lower bound is greater than the upper bound, that's an empty string too.

Not the reverse of the string that would be selected if the bounds were reversed.

However, when we compare bounds, we have to remember that negative indices count backward: text[1:-1] is everything except the first and last characters of the string, because the index -1 means "the last legal index of the sequence", and the upper bound of a slice is never included. It may look odd, but it's consistent.
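These three cases can be checked directly. The sketch below reuses 'uranium' as the value of text, which is an assumption made for the example.

```python
text = 'uranium'

# Equal bounds: an empty range, so an empty string.
same_bounds = text[1:1]
print(repr(same_bounds))  # ''

# Lower bound greater than upper bound: also empty, not reversed.
crossed_bounds = text[4:1]
print(repr(crossed_bounds))  # ''

# -1 is the last index, and the upper bound is excluded, so this
# drops the first and last characters.
inner = text[1:-1]
print(inner)  # raniu
```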

The other thing that's important to remember about slicing is that it always creates a new object.

But only the sequence being sliced is copied, not the objects it refers to, so aliasing is still possible.
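A quick way to see the copy is to slice a list and then change the copy. This sketch uses a flat list of integers, so there are no nested objects to share; the names `values` and `whole` are illustrative.

```python
values = [1, 2, 3, 4]

# Slicing with both bounds omitted gives a new list with the same contents.
whole = values[:]
print(whole == values)  # True
print(whole is values)  # False

# Replacing an element of the copy leaves the original alone.
whole[0] = 99
print(values)  # [1, 2, 3, 4]
print(whole)   # [99, 2, 3, 4]
```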

Here's an example: the list points has four elements, each of which is a reference to a two-element list.

The expression points[1:-1] creates a new list with two elements. Those two elements are references to the second and third of the sublists that points referred to.

If we change the content of those sublists…

…by reaching through middle

…then not only is middle changed…

…but points appears to change as well.
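The steps above might look like the following sketch. The coordinate values and the replacement strings are assumptions, since the slide contents are not in the transcript.

```python
# Four two-element sublists (values are illustrative).
points = [[10, 10], [20, 20], [30, 30], [40, 40]]

# A new two-element list that refers to the second and third sublists.
middle = points[1:-1]

# Change the sublists by reaching through middle.
middle[0][0] = 'whoops'
middle[1][0] = 'aliasing'

print(middle)  # [['whoops', 20], ['aliasing', 30]]
print(points)  # [[10, 10], ['whoops', 20], ['aliasing', 30], [40, 40]]
```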

Let's have another look at what just happened. Here's how points is laid out in memory.

When we slice it to create middle, Python copies the second and third elements of points, but those values are references to other lists. points and middle now both contain references to those sublists.
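The memory picture is not reproduced here, but the sharing it shows can be checked with the `is` operator, which tests whether two names refer to the same object. The values below are illustrative.

```python
points = [[10, 10], [20, 20], [30, 30], [40, 40]]
middle = points[1:-1]

# The outer lists are different objects...
outer_shared = middle is points
print(outer_shared)  # False

# ...but the sublists inside them are the very same objects.
first_shared = middle[0] is points[1]
second_shared = middle[1] is points[2]
print(first_shared, second_shared)  # True True
```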

If we overwrite middle[0][0]

…and middle[1][0], the changes are shared by points.

Thank you.