Hello, and welcome to the fifth episode of the Software Carpentry lecture on Make. In this episode, we'll see how to define macros and include one Makefile in another to handle differences between machines.
As in previous episodes, we are exploring how to manage tasks and dependencies automatically.
We have written a Makefile that will automatically re-create the paper we're working on if any of our raw data files change.
Just when we thought we were done, our supervisor reminded us that all papers must conform to the university's new style rules.
That means that paper.pdf has one more dependency: the official university style file euphoric.wps.
Here's the problem: on our laptop, that file lives in C:\papers.
On the machine we use in the lab, though, it's in /lib/styles.
Now, we could create a directory called /lib/styles on our laptop, and put a copy of euphoric.wps there, but there's another problem coming up fast behind us here on the highway of life.
The university also has a style guide for diagrams, which is in a file called euphoric.fig.
Once again, on our laptop, it's installed in C:\papers…
…but it's in /lib/styles in the lab. How should we handle this difference?
Let's start with the Makefile we've written so far.
The brute-force approach is to just add the two new dependencies like this.
As you can see, though, there's some redundancy here: we're specifying the same directory twice.
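As a rough sketch, the brute-force version might look something like the Makefile below. The transcript doesn't reproduce the actual file, so the prerequisite names (paper.wpd, figure-1.svg, summary-%.dat), the tool names (wpd2pdf, sgr), and their flags are placeholders assumed for illustration; only the style file names and the laptop directory come from the lecture itself.

```make
# Sketch only: wpd2pdf, sgr, their flags, and the .wpd/.svg/.dat file names
# are hypothetical stand-ins. Recipe lines must begin with a tab.

# The paper is rebuilt from the word processor file and both figures.
paper.pdf : paper.wpd figure-1.svg figure-2.svg
	wpd2pdf --style c:/papers/euphoric.wps $<

# Each figure is drawn from its matching data file.
figure-%.svg : summary-%.dat
	sgr -N -r -s c:/papers/euphoric.fig $@ $^
```

The directory c:/papers appears once in each action, which is the redundancy we're worried about.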
And notice before we go on that we haven't explicitly listed euphoric.wps or euphoric.fig as prerequisites of paper.pdf or the two figures we're generating. Some people would include them, just in case, but it's more common not to list dependencies on "system" files.
But back to our problem: how do we handle the fact that these two paths need to be different when we're re-creating our paper in the lab?
The first option is to use copy and paste, and write two completely separate Makefiles.
What we really mean, though, is write and maintain, and that's why this is a bad idea. As soon as we have two of anything, we'll eventually update one but forget to update the other. Makefiles are already hard enough to debug; any "solution" that adds more complexity and risk isn't really a solution at all.
Our second option is to put everything in one Makefile, and then to comment out the bits intended for the machine we aren't on.
This is problematic too. First, we have to make sure we always comment and uncomment lines consistently—if we uncomment the line for creating the paper on our laptop, for example, but forget to uncomment the line for building the figures, we're going to have another debugging headache.
Commenting and uncommenting lines also makes life more difficult for our version control system. If we update our Makefile from version control, then change the commenting on a few lines, the version control system will want to save those changes in the repository the next time we commit. We probably don't actually want to do that, since it would mean that the next time we updated on the other machine, its Makefile would be overwritten.
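To make the risk concrete, here is a sketch of what the comment-and-uncomment approach might look like. As before, the tool names, flags, and prerequisite names are hypothetical placeholders rather than anything taken from the real Makefile.

```make
# Sketch only: wpd2pdf, sgr, and their flags are hypothetical stand-ins.
# Each action has a laptop line and a commented-out lab line.

paper.pdf : paper.wpd figure-1.svg figure-2.svg
	wpd2pdf --style c:/papers/euphoric.wps $<
#	wpd2pdf --style /lib/styles/euphoric.wps $<

figure-%.svg : summary-%.dat
	sgr -N -r -s c:/papers/euphoric.fig $@ $^
#	sgr -N -r -s /lib/styles/euphoric.fig $@ $^
```

Switching machines means flipping the comments in both rules at once; flipping only one of them leaves the build pointing at two different directories.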
The third option—the right one—is to refactor our Makefile to make the problem go away entirely.
We can do this by defining a macro, just as we would define a constant or variable in a program.
Here's our Makefile with a macro defined and used.
The definition looks like definitions in most programming languages: the macro is called STYLE_DIR, and its value is c:/papers/.
To use the macro, we put a dollar sign in front of it (just as we would do in the shell) and wrap its name in curly brackets. This tells Make to insert the macro's value, so that these two directory paths are what we want on our laptop.
This is certainly a step forward: now, when we want to move our Makefile from one machine to another, we only have to change one definition in one place.
We no longer have to worry about consistency…
…but we're still making changes to a file that's under version control that we don't want written back to the repository.
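A minimal sketch of a Makefile with the macro defined and used might look like this. The macro name and value are the ones from the lecture; the tool names, flags, and prerequisite names are assumed placeholders.

```make
# Sketch only: wpd2pdf, sgr, and their flags are hypothetical stand-ins.

# The one definition to change when moving between machines.
STYLE_DIR = c:/papers/

paper.pdf : paper.wpd figure-1.svg figure-2.svg
	wpd2pdf --style ${STYLE_DIR}euphoric.wps $<

figure-%.svg : summary-%.dat
	sgr -N -r -s ${STYLE_DIR}euphoric.fig $@ $^
```

Because the value ends with a slash, the file name follows the closing bracket directly.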
Before we look at the second half of the solution, it's important to note that we have to put curly brackets or parentheses around a macro's name when we use it: we can't just write $MACRO.
If we do, Make will interpret it as $M (a reference to the macro M) followed by "ACRO".
Since we probably don't have a macro called M, $M will expand to the empty string, so $MACRO without parentheses will just be "ACRO".
This is almost certainly not what we want…
Why does Make think that macro names are only one letter long unless we tell it they're longer? To make a long story short, it's another wart left over from its history. Almost everyone trips over it occasionally, and as with other bugs, it can be very hard to track down.
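This is easy to see for ourselves with a small, self-contained Makefile. The macro name, its value, and the target name show are arbitrary choices made for this demonstration.

```make
# A self-contained demonstration; the macro name and value are arbitrary.
MACRO = some-value

# "make show" prints how each form of reference expands.
show :
	@echo "braces:  ${MACRO}"
	@echo "parens:  $(MACRO)"
	@echo "neither: $MACRO"
```

Running make show on this file should print some-value on the first two lines and ACRO on the third, provided nothing called M is defined in the Makefile or in the environment.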
Back to our Makefile… It's common practice to use macros to define all the flags that tools need, so that if a tool is invoked in two or more actions, it's passed a consistent set of flags.
Here, for example, we're defining STYLE_DIR to point to the directory holding our style files, then using that definition in two other macros.
The first, WPD2PDF_FLAGS, is the single flag and argument we want to pass to the tool that turns our word processor file into a PDF.
The second, SGR_FLAGS, combines STYLE_DIR with a couple of other flags to build the arguments for the tool that turns data files into SVG diagrams.
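Sketched out, the three definitions and their use might look like the following. The macro names come from the lecture, but the flag spellings (--style, -N, -r, -s), the tool names, and the prerequisite names are assumptions made only to give the macros something to hold.

```make
# Sketch only: the tool names and flag spellings are hypothetical.

STYLE_DIR = c:/papers/

# The single flag and argument for the word-processor-to-PDF tool.
WPD2PDF_FLAGS = --style ${STYLE_DIR}euphoric.wps

# The style flag plus a couple of other flags for the diagram tool.
SGR_FLAGS = -N -r -s ${STYLE_DIR}euphoric.fig

paper.pdf : paper.wpd figure-1.svg figure-2.svg
	wpd2pdf ${WPD2PDF_FLAGS} $<

figure-%.svg : summary-%.dat
	sgr ${SGR_FLAGS} $@ $^
```

If another action ever needs to call the same tool, it can reuse the same macro and is guaranteed to get the same flags.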
And now we're ready to solve our original problem. Let's move the definition of STYLE_DIR (the macro that changes from machine to machine) out of our main Makefile, and into a Makefile of its own called config.mk.
We can then include that file in our main Makefile using Make's include directive.
Our other macros and commands can then use the definition of STYLE_DIR just as if it had been defined in the main Makefile.
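As a sketch, the split might look like this. The file name config.mk and the include line are from the lecture; the flag spellings in the main Makefile are hypothetical, as before.

```make
# config.mk: holds only what differs between machines.
STYLE_DIR = c:/papers/
```

```make
# Makefile: sketch only; the flag spellings are hypothetical.

# Read config.mk as if its contents had been typed here.
include config.mk

WPD2PDF_FLAGS = --style ${STYLE_DIR}euphoric.wps
SGR_FLAGS = -N -r -s ${STYLE_DIR}euphoric.fig
```

Nothing in the main Makefile now mentions a machine-specific path.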
Once we've tested this to make sure it works, we can copy config.mk to create two files that we'll put in version control.
The first, config-home.mk, defines STYLE_DIR for use on our laptop.
The second, config-lab.mk, defines it for use in the lab.
As we said, these two files go in version control, and are only changed when they need to be (i.e., when the style files move, or their names change).
We then copy one or the other on the machine we're using to create the file config.mk that our main Makefile actually includes.
For example, here's what we have in the paper directory on our home machine when we do a fresh checkout from version control: along with our data files and the word processor file, we have our main Makefile and the two machine-specific configuration makefiles.
So we copy config-home.mk to create config.mk.
Meanwhile, when we check out in the lab…
…we copy config-lab.mk to create config.mk.
In both cases, our main Makefile is now happy, because the file it's including now exists, and has the right definition of STYLE_DIR.
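For concreteness, the two files under version control might contain nothing more than the lines below. The file names and directories are the ones mentioned in the lecture; the trailing slash on the lab path is an assumption made to match the laptop value.

```make
# config-home.mk: settings for the laptop.
STYLE_DIR = c:/papers/
```

```make
# config-lab.mk: settings for the lab machine.
# The trailing slash is assumed, to match the laptop definition.
STYLE_DIR = /lib/styles/
```

Whichever one we copy to config.mk, the main Makefile sees a single definition of STYLE_DIR and never needs to change.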
We can also solve this problem by defining STYLE_DIR on the command line each time we run Make.
To do this, we add the macro's name, an equals sign, and the value we want to give it to the command line, as in make STYLE_DIR=/lib/styles/.
This is almost always a bad idea, though.
We have to remember to type the definition each time…
…and we have to type it correctly each time. This isn't too bad with just one definition, but if there are half a dozen, well, you see the problem.
There's also no record in the Makefile itself of the flag, which makes life harder for other people who want to re-create our paper. How do they know what to type?
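Here is a small sketch for trying this out with GNU Make, which takes command-line definitions in the form NAME=value. The target name show-style-dir is made up for this demonstration.

```make
# Definition used when nothing is given on the command line.
STYLE_DIR = c:/papers/

# Print whichever value of STYLE_DIR is in effect.
show-style-dir :
	@echo "STYLE_DIR is ${STYLE_DIR}"

# "make show-style-dir" reports the definition above.
# "make show-style-dir STYLE_DIR=/lib/styles/" reports /lib/styles/ instead,
# because a definition on the command line takes precedence over an
# ordinary definition in the Makefile.
```

Note that the lab path appears nowhere in the file, which is exactly the record-keeping problem described above.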
There are many other approaches to handling platform dependence in builds that we won't go into in this lecture.
One of the most popular, which is used by tools like CMake and GNU's Autoconf and Automake, is to write a higher-level specification for the build that can then be compiled to create Makefiles or build files for other tools like integrated development environments.
The main benefit of doing this is that these tools can manage, and even automatically discover, the difference between machines, so that we don't ever have to worry about them.
The downside is that these higher-level build files are even harder to debug than Makefiles.
Remember: a build file is a program. Automating tasks with build files can save you endless hours…
…but you have to treat them with the same respect you would give any other program.
In our next episode, we'll have a quick look at how automated builds can be used to support reproducible research.