Hello, and welcome to the twelfth episode of the Software Carpentry lecture on Python. This episode will show you how to take sections out of lists, strings, and tuples.
Lists, strings, and tuples are all sequences, which means they can all be indexed by integers in the range 0 to
len minus 1.
But they can also be sliced using a range of indices.
To see how slicing works, let's assign the string
'uranium' to a variable.
If we index the string with the expression
1:4, we get back the characters from index 1, up to but not including index 4, i.e., characters 1, 2, and 3.
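Here's a minimal sketch of that slice in code. The variable name text is an assumption on our part, since the name used on the slide isn't shown here.

```python
# Assumed variable name; the original slide may have used a different one.
text = 'uranium'

# Take the characters at indices 1, 2, and 3. Index 4 is not included.
part = text[1:4]
print(part)  # ran
```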
If we don't specify the lower bound, it defaults to 0, which is the start of the string.
Similarly, if we don't specify the upper bound, it defaults to the end of the string.
And we can use negative indices as bounds too, which count positions backward from the end of the string. The slice expression
-4: gives us characters -4, -3, -2, and -1.
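To make those defaults concrete, here's a short sketch. The variable name text and the particular bound of 4 in the first two slices are our own picks for illustration.

```python
# Assumed variable name and example bounds, chosen for illustration.
text = 'uranium'

# No lower bound: the slice starts at index 0.
first_four = text[:4]
print(first_four)  # uran

# No upper bound: the slice runs to the end of the string.
from_four = text[4:]
print(from_four)  # ium

# Negative lower bound: count back four places from the end.
last_four = text[-4:]
print(last_four)  # nium
```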
Indices are interpreted the same way for slices as they are for single elements, but there is one important difference. When a single index is used to get one element from a string or list, Python always checks that it's in bounds, and gives an error if it isn't.
But Python just truncates out-of-bounds values when slicing.
Here's our string
If we try to get element 400, we get an error, because the string isn't that long.
If we take a slice from index 1 up to index 400, though, Python quietly truncates the upper bound to the end of the string for us.
Some people find this useful…
…but others trip over the inconsistency from time to time.
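Here's a small sketch of that difference, assuming the string is stored in a variable we'll call text. Indexing past the end raises an IndexError, while slicing past the end just stops at the last character.

```python
# Assumed variable name, chosen for illustration.
text = 'uranium'

# A single out-of-bounds index is an error.
index_failed = False
try:
    text[400]
except IndexError:
    index_failed = True
print('indexing raised an error:', index_failed)

# An out-of-bounds slice bound is truncated instead.
clipped = text[1:400]
print(clipped)  # ranium
```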
This behavior is handy, though: no matter how long the string
text is, the expression
text[1:3] is always legal, but its value may be zero, one, or two characters long.
Have a look at these examples and make sure you understand why each one has the value it does.
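As a sketch of the kind of cases worth checking, here's the same slice applied to strings of different lengths. The sample strings are our own picks, not necessarily the ones on the slide.

```python
# Sample strings chosen for illustration, from empty up to four characters.
samples = ['', 'a', 'ab', 'abc', 'abcd']

# text[1:3] never fails; it just returns fewer characters for short strings.
results = []
for text in samples:
    results.append(text[1:3])
    print(repr(text), '->', repr(text[1:3]))
```

The first two results are empty strings, 'ab' gives 'b', and both 'abc' and 'abcd' give 'bc'.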
To be consistent, the expression
text[1:1] is always the empty string.
Because going from location 1, up to but not including location 1, is an empty range.
Carrying on, if the lower bound is greater than the upper bound, that's an empty string too.
Not the reverse of the string that would be selected if the bounds were reversed.
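Here's a quick sketch of both empty cases, again assuming a variable called text holding 'uranium'. The bounds 4 and 1 in the second slice are just an example.

```python
# Assumed variable name and example bounds, chosen for illustration.
text = 'uranium'

# Equal bounds: an empty range, so an empty string.
same_bounds = text[1:1]
print(repr(same_bounds))  # ''

# Lower bound greater than upper bound: also empty, not 'nar'.
crossed = text[4:1]
print(repr(crossed))  # ''

# For comparison, the bounds the right way round.
forward = text[1:4]
print(forward)  # ran
```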
However, when we compare bounds, we have to remember that negative indices count backward:
text[1:-1] is everything except the first and last characters of the string, because the index -1 means "the last legal index of the sequence", and the upper bound of a slice is not included. It may look odd, but it's consistent.
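A short sketch may help here. We assume text holds 'uranium', so the index -1 refers to the same position as index 6.

```python
# Assumed value for text, chosen for illustration.
text = 'uranium'

# -1 counts back from the end, so it is the same position as len(text) - 1.
inner = text[1:-1]
print(inner)  # raniu
same_inner = text[1:len(text) - 1]

# 1 is numerically greater than -1, but the slice is not empty,
# because -1 stands for position 6 in this string.
# Swapping the bounds does give an empty string: position 6 comes after 1.
swapped = text[-1:1]
print(repr(swapped))  # ''
```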
The other thing that's important to remember about slicing is that it always creates a new object.
But only the thing being sliced is copied, so aliasing is still possible.
Here's an example: the list
points has four elements, each of which is a reference to a two-element list.
points[1:-1] creates a new list with two elements. Those two elements are references to the second and third of the sublists that
points referred to.
If we change the content of those sublists…
…by reaching through the new list…
…then not only does the new list change, but
points appears to change as well.
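Here's a minimal sketch of that aliasing. The coordinate values and the replacement value 99 are made up for illustration; the names points and middle follow the lecture.

```python
# Four two-element sublists; the values are assumed for illustration.
points = [[10, 10], [20, 20], [30, 30], [40, 40]]

# A new two-element list whose items refer to the second and third sublists.
middle = points[1:-1]

# Change a sublist by reaching through middle.
middle[0][0] = 99

print(middle)  # [[99, 20], [30, 30]]
print(points)  # [[10, 10], [99, 20], [30, 30], [40, 40]]
```

The list points was never assigned to directly, yet its second sublist now shows the new value, because both lists refer to the same sublist.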
Let's have another look at what just happened. Here's how
points is laid out in memory.
When we slice it to create
middle, Python copies the second and third elements of
points, but those values are references to other lists.
points and middle now both contain references to those sublists.
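One way to check this picture for ourselves is with Python's is operator, which tests whether two names refer to the very same object. This is just a sketch, with made-up coordinate values.

```python
# Values assumed for illustration.
points = [[10, 10], [20, 20], [30, 30], [40, 40]]
middle = points[1:-1]

# The slice itself is a new list object...
outer_is_shared = middle is points
print(outer_is_shared)  # False

# ...but its elements are the same sublists that points refers to.
first_is_shared = middle[0] is points[1]
second_is_shared = middle[1] is points[2]
print(first_is_shared, second_is_shared)  # True True
```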
If we overwrite
middle, the changes are shared by