Find us on GitHub

Teaching basic lab skills
for research computing

Databases: Filtering

Databases/Filtering at YouTube

Welcome to the Software Carpentry screencast on Databases. This screencast is on filtering results from a databse.

As we've seen in a previous screencast, we can use the SELECT statement to pull out columns from a table, and add computed values to results.

One of the most powerful features of a database is the ability to filter through your data to find the records that match certain criteria.

Let's say we wanted to see all of the Experiments before the year 1990.

To do this we'd write a SELECT query with all of the columns from the Experiments table,

SELECT * FROM Experiment

and we use the WHERE command to specify our filter conditions


We put the expression, ExperimentDate < '1990-01-01'

SELECT * FROM Experiment WHERE ExperimentDate < '1900-01-01'

This specifies that we want only experiment info with an experiment date before January 1st, 1990.

You can think about how this query works as the database inspecting each row from the Experiments table, and checking it against the condition in the WHERE clause. If the condition holds true, then that row is included in the results, otherwise it is filtered out.

We can use other comparison operators in our filtering conditions. For example, we could ask for all of the Experiments in table that took 3 or more hours. To do this we change the WHERE condition to be Hours >= 3

SELECT * FROM Experiment WHERE Hours >= 3

We can use most of the familiar comparision operators: equal to, not equal to, greater than, less than, and so on.

We can make our WHERE conditions even more sophisticated by using logical operators like AND and OR to combine conditions.

For instance, if we want to find all of the experiments run by mlom that took more than three hours, we can modify our query. It already filters for experiments that take more than 3 hours. We can add AND and then the other condition that needs to be met, that the loginID is mlom:

... WHERE Hours >= 3 AND LoginID = 'mlom'

When using an AND operator, each record returned now must satisfy two conditions: in this case, the Hours spent has to be greater than 3, and the loginID must be mlom.

If we wanted experiments either by mlom or by best, we would write

... WHERE Hours >= 3 AND (LoginID = 'mlom' OR LoginID = 'best')

The parentheses ensure that the OR-clause is evaluated first, so it is true whether the loginID is mlom or best, and the entire WHERE condition is true if the OR-clause is true AND the hours spent are greater than 3.

Another way to write this OR-clause is to use the IN operator.

With this operator, we can write a condition that is true if a field—the loginID in this case—contains anything from a list of values:

WHERE Hours >= 3 AND LoginID IN ('mlom', 'best')

This is exactly the same as writing out the OR clause, but it's shorter to write, and easier to read.

In this screencast we've shown that you can select data from a table and then filter those rows based on TRUE and FALSE conditions using a WHERE clause.