Hello, and welcome to the second episode of the Software Carpentry lecture on the Unix shell. In this episode, we'll have a look at how files and directories are organized, and how to navigate around them.
As we said in the last episode, a computer has four main jobs: run programs, store data, communicate with other computers, and interact with us.
One way the computer can interact with us is through a command shell: we type commands, the shell tells the computer to run programs on our behalf, and then the shell shows us the output from those programs.
Some of the commands we will use most often are ones related to storing data on disk.
The subsystem responsible for this is called the file system.
It organizes our data into files, which hold information…
…and directories, which hold files or other directories.
In the next few minutes, we'll see how we can use the shell to view what's in the file system.
Or to be more precise, how we can use the shell to run other programs that will show us what's in the file system.
Let's start by logging in to the computer.
Here, we're showing the shell's prompt in bold.
And explanatory text (like this message) in blue.
Type our user ID—we'll show user input in green.
And then our password. Most systems will print stars to obscure it, or nothing at all, in case some evildoer is shoulder surfing behind us.
Once we have logged in, we'll see a shell prompt, which is usually just a dollar sign (but which may show extra information, like our user ID).
The shell prompt is exactly like Python's
>>> prompt: it signals that the shell is waiting for us to type something in.
Type "whoami", followed by "enter". This command prints out the ID of the current user, i.e., shows us who the shell thinks we are.
When we enter it, the shell finds a program called
whoami, runs it…
…displays its output…
…and then displays a new prompt, telling us that it's ready for more commands.
Now that we know who we are, we can find out where we are using
pwd, which stands for "print working directory".
This is our current default directory, i.e., the directory the computer assumes we want to use unless we specify something else explicitly.
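As a small sketch of the two commands so far, we can capture their output in variables and print it with labels. The variable names here are our own, and the values printed will depend on the machine and account you are using.

```shell
# Ask the system who we are and where we are.
current_user=$(whoami)   # the ID of the current user
current_dir=$(pwd)       # the current working directory, as an absolute path

echo "user:      $current_user"
echo "directory: $current_dir"
```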
The computer's response is
/users/vlad. To understand what this means, let's have a look at how the file system as a whole is organized.
At the very top of the file system is a directory called the root directory that holds everything else the computer is storing.
When we want to refer to it, we just use a slash character,
/. This is the leading slash in
/users/vlad.
Inside that directory (or underneath it, if you're drawing a tree) are several other directories, such as
bin (which is where some built-in programs are stored)…
users (where users' personal directories are located)…
tmp (for temporary files that don't need to be stored long-term), and so on.
We know that our current working directory,
/users/vlad, is stored inside
/users because
/users is the first part of its name. Similarly, we know that
/users is stored inside the root directory
/ because its name begins with
/. Underneath
/users, we find one directory for each user with an account on this machine. The mummy's files are stored in
/users/imhotep, the Wolfman's in
…and ours in
/users/vlad…
…which is why
vlad is the last part of the directory's name.
Notice, by the way, that there are two meanings for the
/ character. When it appears at the front of a file or directory name, it refers to the root directory. When it appears inside a name, it's just a separator.
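One way to see how a path breaks into parts is to use the standard dirname and basename utilities, which split off the containing directory and the last component. This is only an illustration: both commands work on the text of the path, so the directory does not need to exist on your machine.

```shell
path=/users/vlad

parent=$(dirname "$path")          # everything before the last separator: /users
name=$(basename "$path")           # the last part of the name: vlad
grandparent=$(dirname "$parent")   # what contains /users: the root directory, /

echo "$path is stored inside $parent, and its last part is $name"
echo "$parent is stored inside $grandparent"
```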
Let's see what's inside Vlad's home directory by running
ls, which stands for "listing".
It's not a particularly memorable name, but as we'll see, many others are unfortunately even more cryptic.
ls prints the names of all the files and directories in the current directory in alphabetical order, arranged neatly into columns.
To make its output more comprehensible, we can give it the argument, or flag,
-F, which tells
ls to add a trailing
/ to the names of directories. As you can see, there are seven of these. The names without slashes, such as
solar.pdf, are plain old files.
Here's that output again, with a picture of what it's showing us.
You may have noticed that the files' names are all something dot something. By convention, the second part, called the filename extension, indicates what type of data the file holds.
.txt signals a plain text file,
.cfg is a configuration file full of parameters for some program or other, and so on.
However, this is only a convention, and not a guarantee. Files contain bytes, nothing more; it's up to us and our programs to interpret those bytes according to the rules for PDF documents, images, and so on.
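To make that point concrete, here is a rough sketch that copies a text file to a name ending in .pdf and then compares the two. The file names and the scratch directory are made up for this example. The copy has exactly the same bytes as the original; changing the extension does not turn it into a PDF.

```shell
workdir=$(mktemp -d)                        # scratch directory for the experiment

printf 'just some plain text\n' > "$workdir/notes.txt"
cp "$workdir/notes.txt" "$workdir/notes.pdf"

# cmp -s prints nothing and succeeds only if the files are byte-for-byte identical.
if cmp -s "$workdir/notes.txt" "$workdir/notes.pdf"; then
    same=yes
else
    same=no
fi
echo "identical bytes: $same"

rm -r "$workdir"                            # clean up
```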
Now let's run the command
ls -F data, which tells
ls to give us a listing of what's in our
data directory.
The output shows us that there are four text files and two directories. This hierarchical organization helps us keep our work organized.
Notice while we're here how we spelled the directory name
data. Since it doesn't begin with a slash, it's a relative path…
…i.e., it's interpreted relative to the current working directory.
If we run
ls -F /data, we get a different answer…
…because
/data is an absolute path. The leading
/ tells the computer to follow the path from the root of the filesystem…
…so it always refers to exactly one directory, no matter where we are when we run the command.
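The difference can be tried out safely in a scratch directory. In this sketch the directory layout and the file name readings.txt are invented for the example. The relative path only works while we are in the directory that contains data, but the absolute path works from anywhere.

```shell
workdir=$(mktemp -d)                 # mktemp prints an absolute path
mkdir "$workdir/data"
touch "$workdir/data/readings.txt"

cd "$workdir"
relative_listing=$(ls data)          # no leading slash: resolved from where we are

cd /
from_root=$(ls "$workdir/data")      # leading slash: resolved from the root

echo "relative, from inside the scratch directory: $relative_listing"
echo "absolute, from the root directory:           $from_root"

rm -r "$workdir"                     # clean up
```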
What if we want to change our current working directory?
pwd shows us that we're still "in"
/users/vlad, and
ls without any arguments shows us its contents.
We can use
cd followed by a directory name to change our working directory.
cd stands for "change directory"…
…which is a bit misleading: the command doesn't change the directory…
…it changes the shell's idea of what directory we are in.
cd doesn't print anything, but if we run
pwd after it, we can see that we are now "in"
If we run
ls without arguments now, it lists the contents of
…because that's where we now are.
OK, we can go down the directory tree: how do we go up? If we're still in
…we can use
cd .. to go up one level.
.. is a special directory name meaning "the directory containing this one".
Or more succinctly, the parent of the current directory.
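Here is a short sketch of going down and then back up, again in a made-up scratch directory so that nothing in your own files is touched. We record what pwd reports before and after cd .. so we can compare the two.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/data"

cd "$workdir"
cd data            # go down one level
before=$(pwd)

cd ..              # go up to the parent directory
after=$(pwd)

echo "before cd ..: $before"
echo "after cd ..:  $after"

cd /               # step out of the scratch directory before removing it
rm -r "$workdir"
```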
Sure enough, if we run
pwd after running
cd .., we're back in
The special directory
.. doesn't usually show up when we run
ls. If we add the
-a flag, though, it will be displayed.
-a stands for "show all". It forces
ls to show us directory names that begin with
., such as
..
(which, if we're in
/users/vlad, points to the
/users directory),
and also another special directory that's just called
., which is the directory we're currently in. It may seem redundant to have a name for where we are, but we'll see some uses for it in later episodes.
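We can compare the two listings directly. In this sketch the scratch directory and the file notes.txt are placeholders of our own. Plain ls shows only the one file, while ls -a also shows the two special names.

```shell
workdir=$(mktemp -d)
touch "$workdir/notes.txt"

plain=$(ls "$workdir")       # leaves out names that begin with a dot
all=$(ls -a "$workdir")      # includes . and .. as well

echo "ls:"
printf '%s\n' "$plain"
echo "ls -a:"
printf '%s\n' "$all"

rm -r "$workdir"             # clean up
```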
Everything we have seen so far works on Unix and its descendants, such as Linux and Mac OS X. Things are a bit different on Windows.
Here's a typical directory path on a Windows 7 machine.
The first part,
C:, is a drive letter. This notation dates back to the days of floppy drives…
…and even today, each drive is a completely separate filesystem.
Instead of a forward slash, Windows uses backslash to separate the names in a path.
This causes headaches because Unix uses backslash to escape special characters. For example, if you want to put a space in a filename, you would write it as
\ (backslash followed by space). Please don't ever do this, though: if you put spaces, question marks, and other special characters in filenames on Unix, you're likely to confuse the shell and a lot of other tools.
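To see why spaces cause trouble, here is a small experiment in a scratch directory, with file names invented for the purpose. With the backslash, the shell passes one name containing a space. Without it, the shell splits the text at the space and passes two separate names.

```shell
workdir=$(mktemp -d)
cd "$workdir"

touch my\ notes.txt    # escaped space: one file called "my notes.txt"
touch my notes.txt     # unescaped space: two files, "my" and "notes.txt"

# ls prints one name per line when its output goes to a pipe, so wc -l counts files.
count=$(ls | wc -l | tr -d ' ')
echo "files created: $count"
ls

cd /                   # step out of the scratch directory before removing it
rm -r "$workdir"
```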
Finally, Windows filenames and directory names are case insensitive: upper and lower case letters mean the same thing.
This means that the path name
C:\Users\Vlad could be spelled in 1024 different ways. Some people argue that this is more natural—after all, "VLAD" in all upper case and "Vlad" spelled normally refer to the same person—but it does cause some headaches for programmers, and can be difficult for people whose first language doesn't use a cased alphabet to understand.
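The figure of 1024 comes from counting letters: each letter in the path can be written in upper or lower case independently, so a path with n letters has 2 to the power n spellings. A quick way to check the arithmetic, with variable names of our own choosing:

```shell
path='C:\Users\Vlad'

# Keep only the letters (dropping the colon and backslashes) and count them.
letters=$(printf '%s' "$path" | tr -cd 'A-Za-z' | wc -c | tr -d ' ')

# Two choices per letter: shifting 1 left by n places gives 2 to the power n.
spellings=$((1 << letters))

echo "$letters letters, $spellings spellings"   # prints: 10 letters, 1024 spellings
```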
The Cygwin package tries to make Windows paths look more like Unix paths by allowing us to refer to the C drive as
/cygdrive/c/ instead of as
C: (although the latter does usually work too).
It also allows us to use forward slash instead of backslash as a separator.
But paths are still case insensitive…
…which means that if you try to copy files called
backup.txt (in all lower case) and
Backup.txt (with a capital 'B') into the same directory, the second will overwrite the first.
To summarize, here are the three commands, and two special directory names, that we saw in this episode.
In the next episode, we'll see how to create, rename, and delete files and directories.