The Unix Shell: Permissions

Hello, and welcome to the fifth episode of the Software Carpentry lecture on the Unix shell. In this episode, we'll have a look at the tools Unix gives you to control who has access to what.

In the previous episodes in this lecture, we looked at how to use a command shell to interact with a computer…

…and met a few commonly used commands, such as pwd, mkdir, and cp.

We also met the wildcard character *

…and saw how to redirect output with > and create pipes with |.

It's now time to look at how Unix determines who can see the contents of which files.

And at how it controls who can change those files…

…and run particular programs.

We're going to skip over a lot of the details, and give a simplified overview.

We'll also defer discussion of how Windows manages permissions until the end of the episode—the concepts are similar…

…but its rules are different, and unfortunately there's no exact mapping between its rules and Unix's.

Let's start with a single user.

She has a unique user name and user ID.

Her user name is textual…

…and her user ID is an integer. It might seem redundant to have this as well as her user name, but integers are easier for computers to work with.

Computers also manage groups.

Each group has a unique group name and numeric group ID.

The system administrator (or anyone with equally godlike powers) can put a user in any number of groups.

The list of who's in what group is usually stored in the file /etc/group. If you're in front of a Unix machine right now, or are using Windows and have Cygwin installed, take a moment and have a look at that file.
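If you are at a shell prompt, a few standard commands will show you these names and numbers directly. This is only a minimal sketch: what it prints depends entirely on your own account and machine, so no output is shown here.

```shell
# Show the current user's name, then the numeric user ID behind it.
id -un
id -u

# Show the names of all the groups the current user belongs to.
id -Gn

# Peek at the first few lines of the group file. Each line has the form
#   group_name:password_placeholder:group_ID:comma_separated_members
head -n 5 /etc/group
```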

The third part of the Unix user model is called "other" (it is sometimes loosely called "all").

It's "everyone else", i.e., everyone who isn't the user we're currently concerned with, or a member of any of the groups we're considering.

Now let's look at files (and directories as well, of course). Each file stores the user ID of its owner, and the group ID of its owning group as well. This means that every user on the system falls into one of three categories: the owner of the file, someone else who is in the file's group, and everyone who doesn't fit into the first two categories. For each of these three categories, the computer keeps track of:

whether people in that category can read the file,

whether they can write to it (i.e., modify the file), and

whether they can execute it, i.e., run it if it is a program.

For example, one file's permissions might be on or off as shown in this table. This means that:

the file's owner can read and write it, but not run it,

other people in the file's owning group can read it, but not modify it, and

nobody else can do anything with it at all.
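The pattern just described can be reproduced on a scratch file. This is a sketch under a couple of assumptions: the directory perm-demo and the file example.txt are made-up names, and the chmod command it relies on is the one introduced later in this episode.

```shell
# Create a scratch file to experiment on (hypothetical names).
mkdir -p perm-demo
touch perm-demo/example.txt

# owner: read and write; group: read only; everyone else: nothing
chmod u=rw,g=r,o= perm-demo/example.txt

# The first column of the listing should read -rw-r-----
ls -l perm-demo/example.txt
```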

Let's have a look at this model in action. If we cd into the labs directory, ls shows us that it contains three things: safety.txt, setup, and waiver.txt.

If we run ls -F, it puts a * at the end of setup's name.

This is its way of telling us that setup is executable, i.e., that it's a program of some kind that we can run.
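If you don't have a labs directory handy, the effect can be recreated in a scratch directory. In this sketch labs-demo is a made-up directory name, the three files are empty stand-ins for the real ones, and we mark setup as executable ourselves.

```shell
# Build a stand-in for the labs directory (hypothetical scratch copy).
mkdir -p labs-demo
touch labs-demo/safety.txt labs-demo/setup labs-demo/waiver.txt

# Mark setup as executable for its owner.
chmod u+x labs-demo/setup

# setup should now be listed as setup*, the other two names unchanged.
ls -F labs-demo
```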

Now let's run the command ls -l. The -l flag tells ls to give us a long-form listing. It's a lot of information, so let's go through the columns in turn.

On the right side, we have the files' and directories' names.

Next to them, moving left, are the times they were last modified. Backup systems and other tools use this information in a variety of ways that we'll explore in a later lecture; you can use it right away to tell which files are younger or older than which others.

Next to the modification time is the file's size in bytes.

Next to that is the name of the group that owns it…

…and of the user that owns it.

We'll skip over the second column for now…

…because it's the column on the left that we care about most. This shows the file's permissions, i.e., who can read, write, or execute it.
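To make the columns easier to pick out, here is a small sketch that labels most of them for a single file. The directory listing-demo and its file are made up for the illustration, and the awk field numbers assume the usual layout of ls -l; the modification time is left out because its format varies from system to system.

```shell
# Create a small scratch file with known permissions (hypothetical names).
mkdir -p listing-demo
printf 'wear goggles\n' > listing-demo/safety.txt
chmod u=rw,g=r,o=r listing-demo/safety.txt

# Label the whitespace-separated fields of the long listing.
ls -l listing-demo/safety.txt | awk '{
    print "permissions: " $1
    print "owner:       " $3
    print "group:       " $4
    print "size:        " $5
    print "name:        " $NF
}'
```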

Let's expand one of those permission strings and have a closer look.

The first character tells us what type of thing this is.

A '-' means it's a regular file…

…while a 'd' means it's a directory.

The next three characters tell us what permissions the file's owner has. Here, the owner can read, write, and execute the file.

The middle triplet shows us the group's permissions. If the permission is turned off, we see a dash, so 'r-x' means "read and execute, but not write".

The final triplet shows us what everyone who isn't the file's owner, or in the file's group, can do. In this case, it's 'r-x' again, so everyone on the system can look at the file's contents and run it.
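The breakdown we just walked through can be sketched in a few lines of shell. The string below is an assumed example chosen to match the description (a regular file with 'rwx', 'r-x', 'r-x'); cut simply slices it by character position.

```shell
perms='-rwxr-xr-x'   # assumed example permission string

# Character 1 is the type; characters 2-4, 5-7, and 8-10 are the triplets.
type_char=$(printf '%s' "$perms" | cut -c1)
owner=$(printf '%s' "$perms" | cut -c2-4)
group=$(printf '%s' "$perms" | cut -c5-7)
other=$(printf '%s' "$perms" | cut -c8-10)

echo "type:  $type_char"
echo "owner: $owner"
echo "group: $group"
echo "other: $other"
```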

Before we go any further, let's run ls -a -l to get a long-form listing that includes directory entries that are normally hidden.

As you can see, the permissions for . and .. (this directory and its parent) start with a 'd'.

But look at the rest of their permissions. The 'x' means that "execute" is turned on.

What does "execute" mean for a directory? It's not a program: how can we "run" a directory?

In fact, 'x' means something different for directories—it gives someone the right to traverse the directory, but not to look at its contents.

The distinction is subtle, so let's have a look at an example. Vlad's home directory has three subdirectories called venus, mars, and pluto.

Each of these has a subdirectory in turn called notes

…and those sub-subdirectories contain various files.

If a user's permissions on venus are 'r-x'…

…then if she tries to see the contents of venus and venus/notes using ls

…the computer lets her see both.

If her permissions on mars are just 'r--'…

…then she can list the names of the things in mars, but she cannot look inside mars/notes, because without 'x' she is not allowed to go through mars to get there.

But if her permissions on pluto are only '--x'…

…she cannot see what's in the pluto directory—ls pluto will tell her she doesn't have permission to view its contents.

If she tries to look in pluto/notes, though…

…the computer will let her do that. She's allowed to go through pluto, but not to look at what's there. This trick gives people a way to make some of their directories visible to the world as a whole without opening up everything else.
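You can try this yourself with a scratch copy of the three directories. This is a sketch with several assumptions: planets-demo and the files a.txt, b.txt, and c.txt are made-up names, and we change the owner's own permissions so that we can play the part of the visiting user. It only behaves as described for an ordinary account, since the administrator account bypasses these checks.

```shell
# Build the directory tree and put one file in each notes directory.
mkdir -p planets-demo/venus/notes planets-demo/mars/notes planets-demo/pluto/notes
touch planets-demo/venus/notes/a.txt planets-demo/mars/notes/b.txt planets-demo/pluto/notes/c.txt

chmod u=rx planets-demo/venus   # read and traverse
chmod u=r  planets-demo/mars    # read only
chmod u=x  planets-demo/pluto   # traverse only

# Show the permissions on the three directories themselves.
ls -ld planets-demo/venus planets-demo/mars planets-demo/pluto

# Expected behaviour for an ordinary user is noted on each line;
# "|| true" lets the script carry on after a refusal.
ls planets-demo/venus        || true   # allowed: shows notes
ls planets-demo/venus/notes  || true   # allowed: shows a.txt
ls planets-demo/mars/notes   || true   # refused: cannot go through mars
ls planets-demo/pluto        || true   # refused: cannot read pluto
ls planets-demo/pluto/notes  || true   # allowed: shows c.txt

# To clean up afterwards: chmod -R u+rwx planets-demo && rm -r planets-demo
```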

So much for looking at permissions: if we want to change them, we use the chmod command. The name stands for "change mode", which once again isn't particularly memorable.

Here's a long-form listing showing the permissions on the final grades in the course Vlad is teaching.

Whoops: everyone in the world can read it.

And what's worse, modify it—a crafty student could go in and change his or her grade.

(They could also try to run the grades file if they wanted, which would almost certainly not work.)

Here's the command to change the owner's permissions to 'rw-'.

The 'u' signals that we're changing the privileges of the user (i.e., the file's owner), and 'rw' is the new set of permissions.

A quick ls -l shows us that it worked.
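Here is roughly what that looks like when typed out. This is a sketch on a scratch file: grades-demo and final.grd are hypothetical names, and the first chmod only sets up the wide-open starting state described above.

```shell
# Recreate the starting state: everyone can read, write, and execute.
mkdir -p grades-demo
touch grades-demo/final.grd
chmod u=rwx,g=rwx,o=rwx grades-demo/final.grd
ls -l grades-demo/final.grd   # first column: -rwxrwxrwx

# Set the owner's permissions to read and write only.
chmod u=rw grades-demo/final.grd
ls -l grades-demo/final.grd   # first column: -rw-rwxrwx
```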

Let's run chmod again to give the group read-only permission and then display the results.

Notice as we race by that we've put two commands on a single line. We can do this as long as we separate them with a semicolon.
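A sketch of that one-line form, again on a scratch file. The names group-demo and final.grd are made up, and the first chmod just puts the file into the state we reached in the previous step.

```shell
# Starting state: owner rw-, group and everyone else still rwx.
mkdir -p group-demo
touch group-demo/final.grd
chmod u=rw,g=rwx,o=rwx group-demo/final.grd

# Two commands on one line, separated by a semicolon.
chmod g=r group-demo/final.grd ; ls -l group-demo/final.grd   # first column: -rw-r--rwx
```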

Finally, let's give "other" (everyone on the system who isn't the file's owner or in its group) no permissions at all.

That's what "o=" means: the 'o' signals that we're changing permissions for "other", and since there's nothing on the right of the "=", the new permissions are empty. (Writing "a=" instead would clear the permissions of the owner and the group as well, because 'a' stands for all three categories at once.)
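It is worth trying this step on scratch files, because the choice of letter matters: in chmod's notation 'o' addresses only the people outside the owner and group, while 'a' addresses owner, group, and other together. The sketch below runs both side by side; other-demo, final.grd, and copy.grd are hypothetical names.

```shell
# Two identical scratch files: owner rw-, group r--, everyone else rwx.
mkdir -p other-demo
touch other-demo/final.grd other-demo/copy.grd
chmod u=rw,g=r,o=rwx other-demo/final.grd other-demo/copy.grd

# Clear the permissions of everyone outside the owner and group.
chmod o= other-demo/final.grd
ls -l other-demo/final.grd   # first column: -rw-r-----

# For comparison: "a=" clears owner and group permissions too.
chmod a= other-demo/copy.grd
ls -l other-demo/copy.grd    # first column: ----------
```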

Those are the basics of permissions on Unix. As we said at the outset, though, things work differently on Windows.

There, permissions are defined by access control lists, or ACLs.

An ACL is a list of pairs, each of which combines a "who" with a "what". For example, you could give the Mummy permission to append data to a file without giving him permission to read or delete it, and give Frankenstein permission to delete a file without being able to see what it contains.

This is more flexible than the Unix model…

…but it's also more complex to administer and understand, at least on small systems. (If you have a large computer installation, nothing is easy to administer or understand.)

Some modern variants of Unix actually support ACLs as well as the older read-write-execute permissions, but hardly anyone uses them.

Now that we understand how permissions work, it's time to start creating our own programs.

Let's start by running cat > smallest.

Since we didn't specify an input file, cat will read from the keyboard, i.e., its input will be whatever we type.

And since we put > smallest at the end of the command, the computer will send cat's output to a file called smallest. Making a long story short, this command will copy whatever we type into a file called smallest: it's like a text editor, but without the most useful bits.

Type in this line: wc -l *.pdb | sort | head -1. You may remember this as the pipe we constructed in the previous episode to find the smallest molecule file.

After pressing 'enter' to end the line, type Control-D. You should immediately get a new shell prompt.

Control-D means "end of input" in Unix: it's how we tell cat (or any other program) that there's nothing more for it coming from the keyboard.

The equivalent control character on Windows is Control-Z.
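If you'd rather not type at cat interactively, a here-document gives the same result from a script. This is a sketch, not what the episode does on screen: cat-demo is a made-up directory, and the closing EOF marker plays the role that Control-D plays at the keyboard.

```shell
mkdir -p cat-demo

# Everything between the two EOF markers is handed to cat as its input.
# Quoting 'EOF' stops the shell from expanding the * before cat sees it.
cat > cat-demo/smallest <<'EOF'
wc -l *.pdb | sort | head -1
EOF

# Display what was saved in the file.
cat cat-demo/smallest
```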

Now that our commands are in the file, let's give ourselves the right to run that file as a program by typing chmod u+x smallest.

The argument "u+x" tells chmod to add execute permission for the user without changing anything else. We can use '-' to subtract permissions as well if we want.
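Adding and subtracting a permission looks like this on a scratch file. The names exec-demo and tiny are hypothetical, and the first chmod only fixes a known starting point so the listings are predictable.

```shell
# A one-line scratch script with a known starting state.
mkdir -p exec-demo
printf 'echo hello\n' > exec-demo/tiny
chmod u=rw,g=r,o=r exec-demo/tiny
ls -l exec-demo/tiny   # first column: -rw-r--r--

# Add execute permission for the owner; nothing else changes.
chmod u+x exec-demo/tiny
ls -l exec-demo/tiny   # first column: -rwxr--r--

# Take it away again.
chmod u-x exec-demo/tiny
ls -l exec-demo/tiny   # first column: -rw-r--r--
```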

And now, let's run smallest by typing in its name, just as we would type in the name of any other program.

To be sure we're getting exactly the file we just created, we type ./smallest to tell the shell that we want the smallest that's in the current working directory. This guarantees that even if there's another program called smallest somewhere else on the computer, the shell will run ours.

Sure enough, if we're in the directory containing our PDB files, our little program's output is exactly what we'd get if we ran that pipeline ourselves.
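The whole sequence can be tried end to end without the real molecule files. Everything specific in this sketch is made up for the illustration: the directory smallest-demo, the two tiny .pdb files, and their contents. The script itself is the same one-line pipeline.

```shell
mkdir -p smallest-demo
cd smallest-demo

# Two stand-in "molecule" files with different line counts (2 and 5).
printf 'ATOM 1\nATOM 2\n' > cubane.pdb
printf 'ATOM 1\nATOM 2\nATOM 3\nATOM 4\nATOM 5\n' > methane.pdb

# Save the pipeline in a file and make it executable for its owner.
cat > smallest <<'EOF'
wc -l *.pdb | sort | head -1
EOF
chmod u+x smallest

# Run our program from the current directory; it should report
# cubane.pdb, the file with the fewest lines.
./smallest

cd ..
```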

Try doing that with a bunch of GUIs on your desktop.

Thank you.