Hello and welcome to the third episode of the Software Carpentry lectures on handling directories and files in Python. Here, we'll continue to look at how we can explore directories by looking at the ways in which Python allows us to find out more about the contents of directories.
We now know how to move around directories…
…and see what's in them.
But there are many other things we might want to find out.
We might want to check whether a file or directory already exists. This can be useful before saving a file to allow the user to decide if the file is to be overwritten.
We may have a variable and want to see whether that refers to a file or a directory.
We may want to see if two variables refer to the same file or directory.
We may want to find out what we can do with a file or directory. Are we allowed to read it, to update it, or delete it?
And, we may want to get information such as the file size, who owns it and when it was last modified.
A simple check we often want to do is to see whether a file or directory exists. Python provides the exists function that does this check. This takes an argument which can be an absolute or relative path and returns true if it exists as a file or directory and false otherwise.
Let's start with a relative path to a file.
And now an absolute path.
And a relative path to a directory.
And the absolute path.
And something that does not exist.
And the absolute path to something that does not exist.
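As a rough sketch, the six checks just described might look something like this. The directory and file names ("data", "notes.txt", "missing.txt") are made up for illustration, and the example builds them in a temporary directory so that it can run anywhere.

```python
import os
import tempfile
from os.path import abspath, exists, join

original_dir = os.getcwd()
with tempfile.TemporaryDirectory() as workdir:
    # Hypothetical layout: a "data" directory holding "notes.txt".
    os.mkdir(join(workdir, "data"))
    with open(join(workdir, "data", "notes.txt"), "w") as handle:
        handle.write("some notes\n")

    os.chdir(workdir)  # relative paths below are resolved from here
    try:
        checks = {
            "relative file": exists("data/notes.txt"),
            "absolute file": exists(abspath("data/notes.txt")),
            "relative directory": exists("data"),
            "absolute directory": exists(abspath("data")),
            "relative missing": exists("data/missing.txt"),
            "absolute missing": exists(abspath("data/missing.txt")),
        }
    finally:
        os.chdir(original_dir)  # step out before the directory is removed

for label, result in checks.items():
    print(label, result)
```

The first four checks give True and the last two give False, whether the path is relative or absolute.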
Now, let's look at telling apart files and directories. Python provides two functions, isfile and isdir, which check whether their argument is a path to a file or a directory. They can take relative or absolute paths. Let's import these.
Now, let's call isfile on a relative path to a file.
As expected, it returns true.
And, for a path to a directory.
This time, it returns false.
And for a path to something that does not exist.
It also returns false.
Isdir is much the same. When given the path to a directory…
…it returns true.
And for a file…
…it returns false.
And when given a nonexistent directory…
…again, it returns false.
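Here is a minimal sketch of those isfile and isdir calls. Again, the names are hypothetical, and this version uses absolute paths inside a temporary directory rather than the paths used in the lecture.

```python
import os
import tempfile
from os.path import isdir, isfile, join

with tempfile.TemporaryDirectory() as workdir:
    data_dir = join(workdir, "data")        # hypothetical directory
    notes_file = join(data_dir, "notes.txt")  # hypothetical file
    missing = join(workdir, "missing")      # never created
    os.mkdir(data_dir)
    with open(notes_file, "w") as handle:
        handle.write("some notes\n")

    # Each tuple is (file, directory, nonexistent) or the matching order for isdir.
    file_checks = (isfile(notes_file), isfile(data_dir), isfile(missing))
    dir_checks = (isdir(data_dir), isdir(notes_file), isdir(missing))

print(file_checks)  # (True, False, False)
print(dir_checks)   # (True, False, False)
```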
As isfile and isdir are just functions that return true or false we can use them in conditionals. So here is one example where we define a simple function to print whether the path it is given is…
A file.
A directory.
Or does not exist.
And here we see it running on a path to a file. This time we use an absolute path, just for a change.
And here it is with a path to a directory.
And with a path to something that does not exist.
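A function along these lines could be sketched as follows. The name describe and the example paths are assumptions made for this sketch, and it returns the message instead of printing it directly so that the result is easy to check.

```python
import os
import tempfile
from os.path import isdir, isfile, join


def describe(path):
    """Return a short message saying what kind of thing path refers to."""
    if isfile(path):
        return path + " is a file"
    elif isdir(path):
        return path + " is a directory"
    else:
        # Reached when neither check succeeds, for example a missing path.
        return path + " does not exist"


with tempfile.TemporaryDirectory() as workdir:
    notes_file = join(workdir, "notes.txt")  # hypothetical file
    with open(notes_file, "w") as handle:
        handle.write("some notes\n")
    missing = join(workdir, "missing.txt")   # never created

    messages = [describe(notes_file), describe(workdir), describe(missing)]

for message in messages:
    print(message)
```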
Samefile allows us to check whether two paths point to the same file or directory. This is useful when paths are held in variables. So, let's import it.
And create some variables with file paths.
If we compare file1 and file2, which contain relative and absolute paths to the same file, then…
…we get the expected result of true.
And if we compare file1 to a different path, file3, then…
…we get false.
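To make that concrete, here is a small sketch of the two comparisons. The variable names file1, file2 and file3 follow the narration, but the file names are invented. Note that samefile needs both paths to exist, since it asks the operating system about each of them.

```python
import os
import tempfile
from os.path import abspath, join, samefile

original_dir = os.getcwd()
with tempfile.TemporaryDirectory() as workdir:
    # Two hypothetical files to compare.
    for name in ("first.txt", "second.txt"):
        with open(join(workdir, name), "w") as handle:
            handle.write(name + "\n")

    os.chdir(workdir)  # so the relative paths below resolve
    try:
        file1 = "first.txt"           # relative path
        file2 = abspath("first.txt")  # absolute path to the same file
        file3 = "second.txt"          # a different file

        same_1_2 = samefile(file1, file2)
        same_1_3 = samefile(file1, file3)
    finally:
        os.chdir(original_dir)  # step out before the directory is removed

print(same_1_2)  # True
print(same_1_3)  # False
```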
Before we try to operate on a file, it can be useful to check whether we are allowed to. For example, we may want to open it for reading or writing, delete it, or, for an executable binary, execute it. Python's access function allows us to do these checks. So let's import access.
Access takes two arguments, the path to a file or directory and a flag that specifies what access permissions we want to check. So let's import the flags. There are four.
As an example of each in turn: F_OK allows us to check if the file or directory exists.
R_OK is for checking if we have permission to read the file or directory.
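W_OK is for checking if we have permission to write to it, that is, to edit or update it. Deleting is slightly different: on Unix-like systems, whether we can delete a file depends on the permissions of the directory that contains it.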
And, X_OK is for checking if we can execute a file.
We can combine flags using the bitwise OR operator, the vertical bar. So we can check if we can both read and write a file.
Or check if we can both read and execute a file.
Or if a file exists and we can read it and write it.
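The access checks above could be sketched like this, using a made-up file in a temporary directory. The flags are integers, so combining them with the vertical bar builds a single value that asks for all of the listed permissions at once. The result for X_OK is left unstated because it varies between platforms.

```python
import os
import tempfile
from os import F_OK, R_OK, W_OK, X_OK, access

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "notes.txt")  # hypothetical file
    with open(path, "w") as handle:
        handle.write("some notes\n")

    file_exists = access(path, F_OK)
    can_read = access(path, R_OK)
    can_write = access(path, W_OK)
    can_execute = access(path, X_OK)  # platform dependent for a plain file

    # Combined flags: every requested permission must be granted.
    can_read_and_write = access(path, R_OK | W_OK)
    can_read_and_execute = access(path, R_OK | X_OK)
    exists_read_write = access(path, F_OK | R_OK | W_OK)

    # A path that was never created fails even the existence check.
    missing_exists = access(os.path.join(workdir, "missing.txt"), F_OK)

print(file_exists, can_read, can_write, can_read_and_write)
print(missing_exists)
```

One thing to bear in mind is that the answer from access can be out of date by the time we act on it, so code that opens a file should still be prepared for the open to fail.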
It can also be useful to get operating system information about files and directories.
The stat function returns a record holding various information about a file.
The information in this record can then be accessed. This includes its protection bits.
Inode number.
Device.
Number of hard links.
Owner's user ID.
Owner's group ID.
File size in bytes.
Most recent access time. The meaning is operating system dependent.
Most recent modification time. Again, operating system dependent.
The time of the most recent change to metadata, under Linux, or creation time, under Windows.
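These times may be floats or integers. You can check this by calling the stat_float_times function. Here, it says the values are integers. Note that stat_float_times was removed in Python 3.7; in current versions of Python these times are floats.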
The stat record may also contain operating system-specific information.
For example, for Linux this can include the number of blocks used by the file.
And the file system block size.
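As a sketch of reading these fields, here is stat called on a small made-up file. The use of stat.filemode and datetime to make the values readable is an addition for this example and is not part of the lecture. The last two fields are fetched with getattr because they are not available on every operating system.

```python
import os
import stat
import tempfile
from datetime import datetime

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "notes.txt")  # hypothetical file
    with open(path, "wb") as handle:
        handle.write(b"hello\n")  # six bytes, written in binary mode
    info = os.stat(path)

print("protection bits:", stat.filemode(info.st_mode))
print("inode number:", info.st_ino)
print("device:", info.st_dev)
print("hard links:", info.st_nlink)
print("owner user ID:", info.st_uid)
print("owner group ID:", info.st_gid)
print("size in bytes:", info.st_size)
print("last access:", datetime.fromtimestamp(info.st_atime))
print("last modification:", datetime.fromtimestamp(info.st_mtime))
print("metadata change or creation:", datetime.fromtimestamp(info.st_ctime))

# Operating system-specific fields; absent on some platforms.
print("blocks:", getattr(info, "st_blocks", "not available"))
print("block size:", getattr(info, "st_blksize", "not available"))
```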
We've looked at a number of Python functions for finding out more about files and directories. From the os.path module we used exists, to see if a file or directory exists; isfile and isdir, to determine whether a path specifies a file or a directory; and samefile, to see whether two paths point to the same file or directory. From the os module we used access, to see what access permissions we have to a file or directory and determine if we can read it, write to it, or, for files, execute it. And we used stat to get low-level operating system-specific information such as file sizes, permission bits, user and group IDs, and access, modification, and change or creation times.
Thank you for listening.