SparkBuild - build optimisation

This is a guest post from Scott Castle of Electric Cloud. I've been wanting to get these guys on the blog for a while.

Scott has 5 USB drives full of Electric Cloud software, videos, docs and more to give away. All you need to do is tweet the #sparkbuild hashtag, and 5 lucky people will be chosen at random to get one sent out.

Take it away, Scott!

I hate doing laundry (which is ironic because I love clean clothes). My problem isn't with the bleach smell, or the crap-quality washing machines at the laundromat, or even having to fold everything after; my problem is that it's not a mindless task. I've got to:

  • sort all the clothes into washer-size batches - bright colors, darks, whites, hot- and cold-water fabrics, delicates
  • count the quarters on hand
  • decide which batches will be dried all the way and which only need a half-cycle
  • plan how many loads to run, and in which order, to maximize throughput and minimize quarter use...
  • you get the picture.

This is not an automatic task. It takes a lot of brain power to plan the logistics of it all. But, hello? It's laundry. As a programmer, this is not how I want to spend my time! I'd like to be able to dump all the clothes into the washer, and get clean ones out a little later, and not have to think about this ever again.

I also hate manually compiling code, and I take as many shortcuts as I can get away with. Nobody does full builds unless they're in the release group, but I don't do full incrementals either (going to the top level of the code base and typing 'make all'); in the time it takes to parse every makefile and build everyone's changes, I could have gotten in a load of towels, at least. So I do what I'm betting you do too, if you're a programmer: I go to every directory where I know I have a prerequisite, run 'make all' there, then build my own changes. That's much faster than waiting for a full incremental, but it makes me think of laundry - all that sorting and planning and fluffing and folding...
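To picture the setup, here's a minimal sketch of the kind of recursive-make layout I mean. The directory names and makefile contents are made up for illustration, not taken from any real project:

```make
# Hypothetical layout:
#
#   Makefile          # top level: just recurses into subdirectories
#   libs/Makefile     # builds libutil.a
#   app/Makefile      # builds myapp, which links against libs/libutil.a
#
# Top-level Makefile:
SUBDIRS = libs app

all:
	for d in $(SUBDIRS); do $(MAKE) -C $$d all; done
```

A full incremental from the top re-parses every one of those makefiles. The shortcut is running 'make -C libs all' by hand first, then building only in app/ - which works, but it means I'm the one tracking which directories my target depends on.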

A colleague of mine has the same frustration and, being a better coder than me, wrote a solution. It turns out that if you collect a little data when a full build is run, you can use that data as a map to calculate something he calls a 'subbuild' - the critical path of prerequisites needed for the target I want to build, and where to find the rule to make each one of them.

I know what you're thinking: that's just an incremental! If I had a single make instance, that would be true (and I'd have to parse and evaluate the whole makefile every time I ran make), but I'm working on a code base which uses recursive make, so I can't just go to the top and say 'make mycomponent.exe'. The subbuild technique makes a recursive make structure operate as if it were a single make instance, and that is great because now I don't have to decide, each and every time, which components to build before I compile my own code.
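To make that calculation concrete, here's a minimal sketch in Python. Everything in it is illustrative - the build_map contents, target names, and recorded-data format are my assumptions, not SparkBuild's actual implementation. The idea: during a full build you record, for each target, the directory it was built in and its prerequisites; later, walking that map gives you just the chain needed for one target, in dependency order.

```python
# Hypothetical data recorded during a full build:
# target -> (directory it was built in, its prerequisites)
build_map = {
    "app/myapp":      ("app",  ["libs/libutil.a"]),
    "libs/libutil.a": ("libs", ["libs/util.o"]),
    "libs/util.o":    ("libs", []),
}

def subbuild_plan(target, build_map):
    """Return (directory, target) pairs in the order a subbuild would run them."""
    plan, seen = [], set()
    def visit(t):
        if t in seen or t not in build_map:
            return
        seen.add(t)
        directory, prereqs = build_map[t]
        for p in prereqs:          # prerequisites must be built first
            visit(p)
        plan.append((directory, t))
    visit(target)
    return plan

for directory, t in subbuild_plan("app/myapp", build_map):
    print(f"make -C {directory} {t.split('/')[-1]}")
# prints:
#   make -C libs util.o
#   make -C libs libutil.a
#   make -C app myapp
```

Instead of me remembering that app/myapp needs libs built first, the recorded map answers that question, and only the three makes on the critical path get run.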

My colleagues have coded this technique up into a tool that works with GNU Make (3.80 and 3.81) and NMAKE (7 and 8), and we've released it as a free tool; you can try it yourself at www.sparkbuild.com. And if you're interested in more technical information about make, subbuilds, and dependency trees, check out this post.

Now, if only we could write something to do my laundry...

Image thanks to AlexJReid. Disclaimer: I'm getting no kickbacks for this.

DevOps New Zealand