mdub@DogBiscuit.org
... mmm, crunchy!

Attacking slow continuous-integration builds (notes from CITCON)

Last weekend I went along to CITCON here in Melbourne. Which was great fun, by the way.

There I ran a session on "Attacking slow CI builds". It was a small group, but an interesting discussion, I think. Here are my (rough, unedited) notes:

WHAT is the impact of a slow build?

  • fewer checkins
  • more waiting
  • context switching
  • discourages integration
  • discourages writing of additional tests
  • more chance of overlapping checkins
  • more build breakages
  • more time required to get the build fixed
  • reduced productivity
  • WASTE!

WHY is the build slow?

  • slow tests (particularly acceptance tests)
    • over-testing (testing the same code-paths repeatedly)
    • expensive set-up and tear-down
    • too much testing via the user-interface
    • tests that pause, sleep, or poll (e.g. to deal with AJAX)
  • too much I/O!
  • use of slow infrastructure components (database servers, application servers, etc.)
  • slow hardware

HOW can we make it faster?

  • faster hardware
  • run tests in parallel
  • distribute tests
  • fail fast
    • selective testing: run tests most likely to fail first
      • could use dependency-analysis to identify which tests were affected by recent commits
  • refactor story-based acceptance tests into scenario-based tests
    • bigger tests, with more assertions, offsets set-up/tear-down costs
      • but makes tests more complex
  • share test fixtures between a group of tests
    • but breaks test isolation
  • avoid I/O
    • in-memory database
    • in-memory file-store (RAM disk?)
    • stub out infrastructure components
      • avoid testing these components by side-effect
  • populate the database directly, rather than using the user-interface to set-up for a test
  • separate your system into components that can be tested independently
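
To make the "selective testing" idea a little more concrete, here is a minimal sketch in Ruby. It assumes you already have some map from each test file to the source files it exercises (perhaps derived from coverage data); the TEST_DEPENDENCIES table, the file names and the prioritise_tests helper are all hypothetical, invented purely for illustration.

```ruby
require "set"

# Hypothetical map from each test file to the source files it exercises.
# A real build might derive this from coverage data; here it is hard-coded.
TEST_DEPENDENCIES = {
  "test/customer_test.rb" => ["app/customer.rb"],
  "test/invoice_test.rb"  => ["app/invoice.rb", "app/customer.rb"],
  "test/report_test.rb"   => ["app/report.rb"]
}.freeze

# Returns every test file, with those affected by the changed files first,
# so that the tests most likely to fail get run (and fail) early.
def prioritise_tests(dependencies, changed_files)
  changed = changed_files.to_set
  affected, unaffected = dependencies.keys.partition do |test|
    dependencies[test].any? { |source| changed.include?(source) }
  end
  affected + unaffected
end

# A commit touching only app/report.rb moves report_test.rb to the front.
puts prioritise_tests(TEST_DEPENDENCIES, ["app/report.rb"])
```

Nothing is skipped here; the affected tests simply run first, so a breakage shows up sooner while the full suite still runs.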

Thinking about this later ...

There are two types ...

The suggestions for improving build times seemed to fall into two categories:

  1. optimise the build/tests
  2. throw additional hardware at the problem

My problem with the "throw hardware at it" approach is that it typically only helps for the continuous-integration server; the poor old developers are still left with a slow-running build, and therefore many of the productivity issues still exist.

Another idea

It occurs to me now that we missed a fairly fundamental trick to improve test times: improve the performance of the system-under-test itself. It's a great excuse to start thinking about performance earlier in the project.

"Customer Acceptance Test" does not need to mean end-to-end

On all the projects I've been on in recent years, we've ended up with the majority of the tests being either "developer unit tests", which run super-fast, or "customer acceptance tests" which test end-to-end (browser-to-database) and run super-slow.

Methinks it should be less black-and-white. If we can demonstrate functionality that the customer cares about by calling the underlying logic directly (i.e. at unit-test level), rather than by exercising the user-interface, then what's wrong with that? (We just need one test to prove that the underlying logic has been properly integrated into the UI.)
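
As a rough illustration of what I mean, here is a sketch of customer-facing examples being checked directly against the underlying logic. The business rule, the ShippingCalculator class and the EXAMPLES table are all made up for this post; they're assumptions, not code from any real project.

```ruby
# Hypothetical business rule, invented for illustration:
# orders of 100 or more ship free, everything else costs a flat 10.
class ShippingCalculator
  FREE_SHIPPING_THRESHOLD = 100
  FLAT_RATE = 10

  def cost_for(order_total)
    order_total >= FREE_SHIPPING_THRESHOLD ? 0 : FLAT_RATE
  end
end

# Examples a customer could read and agree with, checked against the
# logic directly: no browser, no application server, no database.
EXAMPLES = [
  { :order_total => 99,  :expected_shipping => 10 },
  { :order_total => 100, :expected_shipping => 0 },
  { :order_total => 250, :expected_shipping => 0 }
].freeze

calculator = ShippingCalculator.new
EXAMPLES.each do |example|
  actual = calculator.cost_for(example[:order_total])
  next if actual == example[:expected_shipping]
  raise "order total #{example[:order_total]}: expected " \
        "#{example[:expected_shipping]}, got #{actual}"
end
puts "#{EXAMPLES.size} acceptance examples passed"
```

A single end-to-end test would still be needed to show that the UI actually calls this logic.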

Railsconf 2008 Highlights

I was lucky enough to be at Railsconf 2008 in Portland last weekend (along with Marty, Rob, Trav and Abhi).

Highlights

  • Meeting other Ruby/Rails enthusiasts from all over. (Well, all over the US, at least).
  • Joel Spolsky's opening keynote was hilarious (in a good way). Some other commentators found it low on content, but I thought it had a strong message: usability matters!
  • Seeing Kent Beck present was fantastic. He had the audience hanging on his every word, as he described how "anything he'd done had taken 20 years to have an impact".
  • Ezra's talk on Vertebra, his XMPP-based "cloud control" project, was fascinating. What a great abuse of technology!
  • The JRuby and Rubinius teams are co-operating closely, in a spirit of friendly, respectful rivalry. Particularly notable is their effort to collaborate (with each other, and Matz) on a rigorous set of executable specs for the Ruby language.
  • The upcoming version of Phusion Passenger will support not only Rails applications, but also Rack (and therefore Merb, Sinatra, Camping), and (get this) WSGI (and therefore a bunch of Python frameworks, including Django)!
  • There are increasingly varied options for deploying Rails apps, including the traditional {Apache,nginx}+{mongrel,thin}, JRuby WARs in a servlet container, Passenger, and the Amazon-EC2-based services like RightScale and Heroku. Heroku's deployment model is pretty damn clever: just "git push".

Regrets

With 4 streams going on, the talks I got to were naturally out-numbered by those I missed. Some of the ones I really wish I'd seen include:

  • MagLev: Gemstone's Ruby implementation-in-progress, based on their Smalltalk VM
  • Scott Chacon on "Using Git" (apparently he went into mind-bending detail of the Git internals)
  • Justin Gehtland's "Small Things, Loosely Joined, and Written Fast"

Git (on the Mac)

Git is the hype. I'm just starting to use it for a couple of projects, both directly, and as a local facade to Subversion.

Here are some suggestions on using git under Mac OS X.

Installation

Installation using MacPorts is pretty painless. Ensure you choose the "svn" variant if you want Git/Subversion integration.

sudo port install git +svn +doc

Another option is the native installer, available at http://code.google.com/p/git-osx-installer/

Textmate

If you use Textmate, the Git Textmate bundle is rather nice.

cd ~/Library/Application\ Support/TextMate/Bundles
git clone git://gitorious.org/git-tmbundle/mainline.git Git.tmbundle

Remember to set the TM_GIT variable (to "/opt/local/bin/git" or "/usr/local/bin/git", as the case may be), otherwise stuff won't work.

Shell completion

For command-line (bash) users, there's TAB-completion available, which is pretty handy. I'm using it directly from my local clone of the git source tree, like this:

# in .bashrc ...

git_completion_script="$HOME/OpenSource/kernel.org/git/contrib/completion/git-completion.bash"
if test -f "$git_completion_script"; then
  source "$git_completion_script"
fi

GitNub for history browsing

GitNub is a sweet little UI for browsing history of git commits.

Using Git

So far, I haven't talked at all about how you actually USE the thing, and don't intend to, since there are already so many great resources out there on the subject. Some I've found useful are:

ReadOnlyFormBuilder

For RubyOnRails developers, form_for and fields_for are the accepted way of DRYing up form templates. You know the deal; you code

<% form_for :customer, :url => customers_path() do |customer_form| %>
  <p>
    <label>Name:</label> 
    <%= customer_form.text_field :first_name, :size => 15 %>
    <%= customer_form.text_field :last_name, :size => 20 %>
  </p>
  ... etc ...
<% end %>

and you get

<form action="/customers" method="post">
  <p>
    <label>Name:</label> 
    <input id="customer_first_name" name="customer[first_name]" size="15" type="text" />
    <input id="customer_last_name" name="customer[last_name]" size="20" type="text" value="" />
  </p>
  ... etc ...
</form>

Rails generates sensible field names and ids for you, and slurps existing values out of the model object. So far, so good.

Lately, I've taken to using the same trick when presenting data, not just when editing it. So, whereas before I might have written:

  <p>
    <label>Name:</label> 
    <span id="customer_first_name"><%= h @customer.first_name %></span>
    <span id="customer_last_name"><%= h @customer.last_name %></span>
  </p>
  ... etc ...

I'll now code it up as:

<% fields_for :customer, :builder => ReadOnlyFormBuilder do |customer_form| %>
  <p>
    <label>Name:</label> 
    <%= customer_form.text_field :first_name, :size => 15 %>
    <%= customer_form.text_field :last_name, :size => 20 %>
  </p>
  ... etc ...
<% end %>

and get the same output. (In case you're wondering, the ids are there to help with automated testing).

Note the similarity between the last code snippet and the first one on this page; apart from the first line they're identical. Usually, I'll put the field-declarations themselves in a partial that's shared between "new", "edit" and "show" actions. That way, your "show" page automatically gets identical layout to the others, just with raw values in place of editable fields.

The ReadOnlyFormBuilder class itself is fairly straightforward - I'm planning to wrap it up into a plugin sometime soon. In the meantime, the implementation of text_field looks something like this:

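# Render the attribute as a <span>, with the id the text field would have had.
def text_field(attribute, options={})
  content_tag("span", html_escape(value_of(attribute)), :id => "#{@object_name}_#{attribute}")
end

# Read the attribute's current value from the model.
def value_of(attribute)
  model.send(attribute)
end

# The object handed to the builder, else the template's matching instance variable.
def model
  @object || @template.instance_variable_get("@#{@object_name}")
end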
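
If you want to play with the idea outside of Rails, here is a stand-alone sketch of the same trick. It is not the real ReadOnlyFormBuilder: the ReadOnlyBuilderSketch class and the Customer struct are hypothetical stand-ins, and ERB::Util.html_escape from the standard library takes the place of the Rails helpers.

```ruby
require "erb"

# Plain-Ruby stand-in for the builder (an assumption for illustration,
# not a FormBuilder subclass): it only knows how to render text fields.
class ReadOnlyBuilderSketch
  def initialize(object_name, object)
    @object_name = object_name
    @object = object
  end

  # Renders the attribute as a <span> carrying the same id the editable
  # field would have had; field options such as :size are ignored.
  def text_field(attribute, _options = {})
    value = ERB::Util.html_escape(@object.public_send(attribute).to_s)
    %(<span id="#{@object_name}_#{attribute}">#{value}</span>)
  end
end

# Hypothetical model object, standing in for an ActiveRecord customer.
Customer = Struct.new(:first_name, :last_name)

builder = ReadOnlyBuilderSketch.new(:customer, Customer.new("Fred", "Smith & Sons"))
puts builder.text_field(:first_name, :size => 15)
puts builder.text_field(:last_name, :size => 20)
```

Because the method takes the same arguments as the editable version, the calling template doesn't need to know which builder it has been given.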

Faster project-wide searching in Textmate

Textmate is a nice editor, but its "Find in Project" (⇧⌘F) function is annoyingly slow in large projects.

So, I'm happy to have found an alternative: GrepInProject++, which leverages the raw power of find(1) and grep(1) for super fast searching. The original GrepInProject was created by Henrik Nyh; Robert Thurnher added a better UI and some other features.

Enjoy it thusly:

$ sudo mv GrepInProjectSearch.nib /Applications/TextMate.app/Contents/SharedSupport/Support/nibs/
  • Open GrepInProject.tmCommand with Textmate

By default the "Grep in Project" command is bound to ⇧⌘F, replacing the built-in "Find in Project" command.

Note: my version is ever-so-slightly different to Robert's; I changed the find to use name rather than path matching (saving a few precious milliseconds), and removed a redundant "recursive" option from the grep.

Rake profiling

Where's the bottleneck in your Rake build? Let's find out. Drop (or include) this in your Rakefile:

module Rake
  class Task
    # Wrap the original execute, printing how long each task took.
    def execute_with_timestamps(*args)
      start = Time.now
      result = execute_without_timestamps(*args)
      execution_time_in_seconds = Time.now - start
      printf("** %s took %.1f seconds\n", name, execution_time_in_seconds)
      result
    end

    # Keep the original under a new name, then swap in the timed version.
    alias :execute_without_timestamps :execute
    alias :execute :execute_with_timestamps
  end
end
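
A variation on the same theme, for what it's worth: rather than printing as each task finishes, collect the timings and print a summary, slowest first, when Rake exits. This sketch swaps the alias trick for Module#prepend (so it needs Ruby 2.0 or later); the TaskTimings module and its totals/report methods are names I've made up, not part of Rake.

```ruby
require "rake"

# Hypothetical helper module (not part of Rake): records how long each
# task's execute step takes, and can print a summary.
module TaskTimings
  # Accumulated execution time in seconds, keyed by task name.
  def self.totals
    @totals ||= Hash.new(0.0)
  end

  # Prints one line per task, slowest first.
  def self.report(io = $stderr)
    totals.sort_by { |_name, seconds| -seconds }.each do |name, seconds|
      io.printf("%8.1fs  %s\n", seconds, name)
    end
  end

  # Overrides Rake::Task#execute; the ensure clause records the elapsed
  # time even when the task raises.
  def execute(args = nil)
    start = Time.now
    super
  ensure
    TaskTimings.totals[name] += Time.now - start
  end
end

Rake::Task.prepend(TaskTimings)

# Print the summary once, after all tasks have run.
at_exit { TaskTimings.report }
```

As before, this only times each task's own actions, not the prerequisites it triggers, so the slowest line in the report is usually a good place to start digging.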

How I Learned to Stop Worrying and Love the Mac

In my new job, a Mac is the preferred tool of the trade. So now I'm learning to use a nice shiny MacBook Pro, and after years developing on Windoze, it's a very pleasant experience. Here are some of the things that are making my life just that little bit more delightful:

  • It's Unix. On Windoze, Cygwin helped a little, but this is soooo much better.
  • Launchbar - an application launcher and more. I bought this almost immediately after getting my Mac, based on some random recommendation somewhere, and haven't regretted it - I probably use it once every 5 minutes, on average. I've subsequently tried Quicksilver, but it didn't feel immediately "right" in the same way Launchbar does.
  • Textmate - "the missing editor". I've been a dedicated Emacs user for about 20 years now, but had to give Textmate a try, given all the hype. It's noice! Many of the features I know and love from Emacs are there (albeit bound to different cryptic key-combinations), and the UI is clean and Mac-savvy.
  • 1Password - a password manager. I got this handy little utility as part of a bundle from macheist.com. It stores all your passwords in a (secure) keychain, indexed by website, making it really easy to log back in next time you visit. Best of all, it integrates with most browsers, meaning you only need to store passwords once. It can store multiple sets of login details per site, too, which is very useful when testing web-apps.
  • Safari is a nice little web browser, and once you turn on the debug menu, it's even better. Normally, I'd reach for Firebug for this sort of functionality, but the built-in Safari equivalent is almost as good.

Diving (thoughtlessly) back into the workforce

As of last week, I'm developing software for money again, after a nine month break.

At the same time, I'm saying goodbye to ThoughtWorks, which is not easy, since I've really enjoyed my time there. I joined TW to hang out with good people, and wasn't disappointed: the faces have changed a little over the years (as people come and go), but TW still employs some of the most talented and passionate people I've ever had the opportunity to work with.

In the end, though, I figured it was time to try something new. I'm now working for Cogent Consulting. I've known Steve and Marty, who run the company, since the early days of the Melbourne eXtreme Programming group, and have a lot of respect for them both. They've started to assemble a very interesting, talented bunch of individuals, and I'm looking forward to the ride.

Among other things, I'm going to be doing a whole lot more Ruby and Rails work than I have to-date. Which feels good, since I've been blathering about Ruby for a few years now.

Nine months at home

Since Easter last year, I've been a house-husband. My beautiful wife (Tanya) was keen to get back to work, and we didn't want our kids (Ngara and Jonah) in full-time care, so I've been home playing Dad. And it's been great.

Lots of people have said how jealous they were of "my time off", like it was some kind of holiday. I have to say, it hasn't felt like a holiday, by any stretch. Looking after kids can be hard work, 13+ hours/day, 7 days/week. Being an introverted control-freak, I found it challenging. Developing software is easier and more relaxing.

I imagined it would be extremely difficult at first. But after a few months, I figured, I'll get good at it, and then it will be relatively easy. Well, perhaps not easy, but manageable. As it turned out, I actually found it pretty manageable from the outset - not quite as hard as I'd imagined, and the kids were surprisingly accepting, once they'd got their heads around it. On the other hand, it hasn't got any easier whatsoever over time; I still struggle with exactly the same challenges now, on a daily basis: how to keep them fed, entertained, safe, and below 100dB.

What a great experience, though. I'm particularly thankful for the time I've been able to spend with Jonah (almost 2). When I stopped work he'd just turned one, and for his first year of life I hardly saw him, given that I was away for most of his waking hours. So it's been really special to be around him this last year, as he's transformed from a carpet-bound nappy-filler into a real little boy. My little boy. First steps, first words, first trip to the emergency department ... all that good stuff.

It was a different story with Ngara (almost 5), though. Previously, I was her "fun" Dad, who got to read her books at bedtime, and play with her in the weekends. Instead, once the novelty of me being home wore off, I became the responsible parent, too busy cleaning the kitchen or changing nappies to be much fun, much of the time. I think it's been good for us both to spend more time together, but I miss being the "fun" one.

Anyway, this little interlude in our lives is about to come to an end, with me about to return to work. I'm very thankful that it's been possible for me to spend this time at home. But in many ways, I'm also glad it's over :-)

Jonahisms

My son Jonah is just starting to talk in earnest. I've forgotten most of his elder sister's cuter sayings, so am making a point of jotting down some of his:

noh
    no (he's nearing 2, so starting to use this fluently)
noooooOOOhohoho
    definitely not
mmmmm
    well, okay then
moh!
    more!
Mum-mah
    his mum
Dad-dee
    that's me
Rah-rah
    Ngara (his elder sister)
tar
    car
tuck
    truck
dut
    duck
Dut! Dut! Dut!
    Quack! Quack! Quack!
neigh
    horse
baap
    sheep
bup
    bug
ay o
    elephant (from "Hey De Ho", a children's song)
rahh
    tiger, lion
mao
    cat
olly daaay
    holiday
Oh oh ooooh
    "Ho ho ho" (used for anything related to Christmas)
bubbish tuck!
    rubbish truck!
moh bubbish tuck?
    where did the rubbish truck go?
bubba dut
    baby duck (always difficult to distinguish from a rubbish truck)
ut si
    outside (his favourite place)
i sti
    swing me "high in the sky"
Moh i sti!
    C'mon, higher!
bah bye
    bye bye (usually delivered with a heartbreaking little wave)

He's saying a lot of other things too, but most of them are completely unintelligible at this point. :-)