http://www.dogbiscuit.org/mdub/weblogDogBiscuit2013-07-02T09:48:00+10:00Mike Williamsmdub@dogbiscuit.orghttp://www.dogbiscuit.org/mdub/weblog/Tech/DevOps/CloudFoundry/DeployingAMicroBOSHCloud Foundry on AWS - Part 1: Deploying a MicroBOSH2013-07-02T09:48:00+10:002013-07-02T09:48:00+10:00<p>
Having spent some time <a href='PlayingWithCloudFoundryV2'>experimenting with Cloud Foundry</a> using Pivotal's flagship deployment at <a href='http://run.pivotal.io'><samp>run.pivotal.io</samp></a>, I wanted to see how easy it was to deploy Cloud Foundry myself.
</p>
<p>
The recommended deployment mechanism for Cloud Foundry uses <a href='http://docs.cloudfoundry.com/docs/running/bosh/'>BOSH</a>, "an open source tool chain for release engineering, deployment, and lifecycle management of large-scale distributed services". Once it's up and running, Cloud Foundry actually has no dependencies on BOSH, and in fact, it's technically possible to deploy CF by <a href='https://github.com/Altoros/cf-vagrant-installer'>other</a> <a href='https://github.com/yudai/cf_nise_installer'>means</a>. But BOSH makes it easier to manage an existing Cloud Foundry deployment, so we'll stick with that.
</p>
<p>
(A word of warning: the official documentation for "<a href='http://docs.cloudfoundry.com/docs/running/deploying-cf/ec2/index.html'>Deploying Cloud Foundry on AWS</a>" currently describes an installation process involving the <samp>bootstrap-cf-plugin</samp> gem. I recommend avoiding that path, as it conflates installation of BOSH and Cloud Foundry in a confusing way. Let's stay focused on BOSH, for now.)
</p>
<p>
BOSH is itself a distributed system, and can be deployed across multiple nodes (e.g. EC2 instances) for better performance and resilience. For our purposes, though, it's sufficient to have all the BOSH components installed on a single node. Such a single-node deployment is called a "MicroBOSH".
</p>
<p>
The BOSH project distributes MicroBOSH as "stemcells", their term for a raw machine image. They provide an AWS-specific stemcell, which can be used to build AMIs, as well as stemcells for other target infrastructures (like vSphere and OpenStack).
</p>
<p>
Before you get too excited, though, you'll need an "inception server". Despite the fancy name, this is nothing more than an EC2 instance in your target AWS region that you can SSH into and use as a staging point for MicroBOSH installation. It's required because:
</p>
<ul class="sparse">
<li>
you'll be downloading some large BLOBs, and it's best to keep them inside the AWS network
</li>
<li>
in order to turn the MicroBOSH stemcell into an AMI, you need to create and mount an EBS volume.
</li>
</ul>
<p>
Any Linux instance running Ruby 1.9.x should do. If you don't have one handy, you can use one of:
</p>
<ul class="sparse">
<li>
Dr Nic's <a href='https://github.com/cloudfoundry-community/inception-server'><samp>inception-server</samp></a> project
</li>
<li>
my <a href='https://github.com/mdub/bosh-inception-vagrant'><samp>bosh-inception-vagrant</samp></a>
</li>
</ul>
<p>
Once you have an inception server running, SSH on in.
</p>
<h3>
Option A: <samp>bosh-bootstrap</samp>
</h3>
<p>
At this point, you're welcome to take a shortcut by way of Dr Nic's <a href='https://github.com/cloudfoundry-community/bosh-bootstrap'><samp>bosh-bootstrap</samp></a> project, which mostly automates the remainder of the process.
</p>
<h3>
Option B: DIY
</h3>
<p>
After some initial experiments with <samp>bosh-bootstrap</samp>, I wanted to understand more about what it was doing, so I ended up building my MicroBOSH manually, using <samp>bosh-bootstrap</samp> as a guideline. If you're a sucker for punishment, like me, then read on ...
</p>
<p>
You'll need the BOSH command-line toolset. Create a <samp>Gemfile</samp> containing:
</p>
<pre>
<code class="ruby">source "https://rubygems.org"
source "https://s3.amazonaws.com/bosh-jenkins-gems/"
gem "bosh_cli", "~> 1.5.0.pre"
gem "bosh_cli_plugin_micro", "~> 1.5.0.pre"
</code></pre>
<p>
and run "<samp>bundle install</samp>" to install the bits you need.
</p>
<p>
Now, use the AWS console (or the API) to set some stuff up:
<ul class="sparse">
<li>
create a keypair (e.g. "<samp>mybosh</samp>")
</li>
<li>
create a security group (e.g. "<samp>bosh</samp>") that allows inbound connections on the following ports:
<ul class="sparse">
<li>
<samp>22</samp> (for SSH)
</li>
<li>
<samp>4222</samp> (for the "nats" pub/sub protocol)
</li>
<li>
<samp>6868</samp> (for the BOSH agent)
</li>
<li>
<samp>25250</samp> (for the BOSH blobstore)
</li>
<li>
<samp>25555</samp> (for the BOSH director)
</li>
<li>
<samp>25777</samp> (for the BOSH registry)
</li>
<li>
<samp>53</samp> (UDP, for DNS)
</li>
</ul>
</li>
<li>
allocate an Elastic IP address (e.g. A.B.C.D)
</li>
</ul>
</p>
<p>
Pick a name for your MicroBOSH, e.g. "<samp>mybosh</samp>", and create a configuration file for it:
<pre>
$ mkdir -p ~/microbosh/deployments/mybosh
$ vi ~/microbosh/deployments/mybosh/micro_bosh.yml
</pre>
</p>
<p>
Here's an example <samp>micro_bosh.yml</samp> file:
<pre>
name: mybosh
logging:
  level: DEBUG
network:
  type: dynamic
  vip: A.B.C.D
resources:
  persistent_disk: 4096
  cloud_properties:
    instance_type: m1.medium
cloud:
  plugin: aws
  properties:
    aws:
      access_key_id: YOURKEY
      secret_access_key: YOURSECRET
      region: ap-southeast-2
      ec2_endpoint: ec2.ap-southeast-2.amazonaws.com
      default_security_groups:
      - bosh
      default_key_name: mybosh
      ec2_private_key: /home/ubuntu/.ssh/mybosh.pem
apply_spec:
  agent:
    blobstore:
      address: A.B.C.D
    nats:
      address: A.B.C.D
  properties:
    aws_registry:
      address: A.B.C.D
</pre>
</p>
<p>
You'll need to:
<ul class="sparse">
<li>
insert the appropriate AWS access/secret key
</li>
<li>
change the region (unless "<samp>ap-southeast-2</samp>" is what you want)
</li>
<li>
replace "A.B.C.D" with the Elastic IP address you allocated earlier
</li>
</ul>
</p>
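<p>Before deploying, it's worth a quick sanity check that the file parses as YAML, and that the Elastic IP is wired into every field that needs it. Here's a throwaway check along those lines (my own helper, not part of the BOSH tooling):</p>

```ruby
require "yaml"

# Throwaway sanity check for a micro_bosh.yml (not part of BOSH itself):
# parse the YAML and confirm the elastic IP appears in all four fields
# that need it.
def elastic_ip_consistent?(yaml_text, elastic_ip)
  config = YAML.safe_load(yaml_text)
  [
    config["network"]["vip"],
    config["apply_spec"]["agent"]["blobstore"]["address"],
    config["apply_spec"]["agent"]["nats"]["address"],
    config["apply_spec"]["properties"]["aws_registry"]["address"],
  ].all? { |address| address == elastic_ip }
end
```

A <code>false</code> result tells you one of the four addresses was missed.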
<p>
Now, you need a MicroBOSH image to install, so download the latest MicroBOSH stemcell:
</p>
<pre>
$ cd ~/microbosh
$ curl -O http://bosh-jenkins-artifacts.s3.amazonaws.com/micro-bosh-stemcell/aws/latest-micro-bosh-stemcell-aws.tgz
</pre>
<p>
Now it's time to crank up the MicroBOSH!
</p>
<p>
Be patient; it takes a while. Consider installing a terminal multiplexer like <samp>tmux</samp> on your inception server, and running this step within a <samp>tmux</samp> session, in case you get disconnected while the deployment is in progress.
<pre>
$ cd ~/microbosh/deployments
$ bosh micro deployment mybosh/
Deployment set to '/home/ubuntu/microbosh/deployments/mybosh/micro_bosh.yml'
$ bosh -n micro deploy ../latest-micro-bosh-stemcell-aws.tgz
Verifying stemcell...
File exists and readable OK
Using cached manifest...
Stemcell properties OK
Stemcell info
-------------
Name: micro-bosh-stemcell
Version: 776
Deploy Micro BOSH
unpacking stemcell (00:00:16)
uploading stemcell (00:10:48)
creating VM from ami-c51380ff (00:00:32)
waiting for the agent (01:01:20)
create disk (00:04:16)
mount disk (00:00:06)
stopping agent services (00:00:01)
applying micro BOSH spec (00:00:19)
starting agent services (00:00:00)
waiting for the director (00:00:18)
Done 11/11 01:18:06
WARNING! Your target has been changed to `https://55.251.169.14:25555'!
Deployment set to '/home/ubuntu/microbosh/deployments/mybosh/micro_bosh.yml'
Deployed `mybosh/micro_bosh.yml' to `https://mybosh:25555', took 01:18:06 to complete
</pre>
</p>
<p>
Note the id of the AMI produced; you can use this for future MicroBOSH deployments in the same region, bypassing the stemcell download and conversion processes, e.g.
<pre>
$ bosh -n micro deploy ami-c51380ff
</pre>
</p>
<p>
Actually, if you're really lucky, Pivotal might have already baked a MicroBOSH AMI in your target region, in which case you can just use that, and save yourself a lot of time (and network traffic).
<pre>
$ AWS_REGION=us-east-1
$ curl http://bosh-jenkins-artifacts.s3.amazonaws.com/last_successful_micro-bosh-stemcell-aws_ami_$AWS_REGION
ami-427b092b
</pre>
</p>
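<p>The per-region AMI ids are published at predictable S3 keys (the bucket and key pattern are taken from the <samp>curl</samp> example above), so a script can construct the URL for any region. A trivial helper, for illustration:</p>

```ruby
# Builds the S3 URL at which the BOSH CI pipeline publishes the id of
# the most recently built MicroBOSH AMI for a region (bucket and key
# pattern taken from the curl example above).
def latest_micro_bosh_ami_url(region)
  "http://bosh-jenkins-artifacts.s3.amazonaws.com/" \
    "last_successful_micro-bosh-stemcell-aws_ami_#{region}"
end
```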
<p>
At the time of writing, they're only baking AMIs for <samp>us-east-1</samp>, so for other regions you'll have to resort to downloading the stemcell, as described above.
</p>
<p>
Once your MicroBOSH is running, you should be able to connect to it:
<pre>
$ bosh target https://55.251.169.14:25555
Target set to `mybosh'
Your username: admin
Enter password: *****
Logged in as `admin'
$ bosh status
Config
  /home/ubuntu/.bosh_config

Director
  Name       mybosh
  URL        https://55.251.169.14:25555
  Version    1.5.0.pre.776 (release:6191c586 bosh:6191c586)
  User       admin
  UUID       eac3cf02-845d-4817-aa55-7626a071304a
  CPI        aws
  dns        enabled (domain_name: microbosh)
  compiled_package_cache disabled
  snapshots  disabled

Deployment
  not set
</pre>
</p>
<p>
Stay tuned for the next exciting episode!
</p>
http://www.dogbiscuit.org/mdub/weblog/Tech/DevOps/CloudFoundry/DatabaseMigrationsOnCloudFoundryRunning database migrations on Cloud Foundry2013-06-19T21:55:00+10:002013-06-19T21:55:00+10:00<p>
An important part of deploying a database-backed application is keeping the database schema up-to-date. For Rails applications, you typically do that by running:
<pre>
$ rake db:migrate
</pre>
</p>
<p>
For Rails applications running on Heroku, you typically run the migrations immediately after deploying a new version of your app, e.g.
<pre>
$ git push heroku master
$ heroku run rake db:migrate
</pre>
</p>
<p>
Since Cloud Foundry takes after Heroku in so many ways, I expected to use a similar workflow when deploying my Rails application on Cloud Foundry. I was surprised to discover that CF does not support any equivalent of "<samp>heroku run</samp>". That is, there's not (yet) a built-in way to run a Rake task or shell-command in the context of the currently deployed application.
</p>
<p>
The <a href='http://docs.cloudfoundry.com/docs/using/deploying-apps/ruby/'>Cloud Foundry documentation</a> suggests a somewhat surprising solution: alter the startup command for your application to execute migrations before (re)-starting the web-app. I duly did so by specifying a custom start command in the Cloud Foundry "<samp>manifest.yml</samp>" for my app:
<pre>
---
applications:
- name: barfly
  command: "bundle exec rake db:migrate && bundle exec rackup -p $PORT"
</pre>
</p>
<p>
We don't really want to run the migrations on every app-server instance, though. Luckily, Cloud Foundry provides meta-data to each instance, in the form of an environment variable, <samp>$VCAP_APPLICATION</samp>. Specifically, it provides an "<samp>instance_index</samp>" key, which contains a unique index for each instance.
</p>
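<p>For illustration, the index check boils down to something like this (a standalone sketch; the helper name is mine, and passing the JSON in as an argument is just for testability):</p>

```ruby
require "json"

# Sketch: decide whether this is the "primary" (index 0) instance,
# given the JSON blob that Cloud Foundry places in $VCAP_APPLICATION.
def primary_instance?(vcap_application_json)
  JSON.parse(vcap_application_json)["instance_index"] == 0
rescue JSON::ParserError, TypeError
  false
end

primary_instance?('{"instance_index":0}')  # => true
primary_instance?('{"instance_index":3}')  # => false
```

Note that an unset or unparseable <samp>$VCAP_APPLICATION</samp> counts as "not primary", which fails safe.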
<p>
So, I created a Rake task to limit an action to the first instance, i.e. the one with <code>instance_index == 0</code>.
<pre>
<code class="ruby">require "json"

namespace :cf do
  desc "Only run on the primary Cloud Foundry instance"
  task :on_primary_instance do
    instance_index = JSON.parse(ENV["VCAP_APPLICATION"])["instance_index"] rescue nil
    exit(0) unless instance_index == 0
  end
end
</code></pre>
</p>
<p>
If the <samp>instance_index</samp> is non-zero, or unset, the task exits Rake early, skipping any subsequent tasks. With this in place, I altered the startup command to make use of the new task:
<pre>
---
applications:
- name: barfly
  command: "bundle exec rake cf:on_primary_instance db:migrate && bundle exec rackup -p $PORT"
</pre>
</p>
<p>
This actually works fairly well; the migrations run, then the application-server starts up. Of course, if the migrations fail, the app-server won't start. That's fine with me, for now, and I imagine it would be acceptable for many apps.
</p>
<p>
Others, though, might want/need tighter control over when database migrations run, rather than just running them automatically on boot. With that in mind, an alternative approach would be to handle migrations entirely separately from application deployments. Cloud Foundry provides easy access to information about database (and other external) services, e.g.
<pre>
$ cf file barfly logs/env.log | grep DATABASE_URL
DATABASE_URL=postgresql://deadb33f:5Dvc0ePHMrwFUuODGQiSYWCYHU-nIzu-@babar.elephantsql.com:5432/deadb33f
</pre>
</p>
<p>
Using those connection details, the deployment script could connect to and migrate the target database schema, prior to the "<samp>cf push</samp>" that updates the app. In many ways, this is preferable to the push-migrate workflow typically used with Heroku, as it creates the flexibility to run constructive migrations independently of application deployments.
</p>
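<p>Extracting the connection details from a <samp>DATABASE_URL</samp> is straightforward; a sketch of what such a deployment script might do (the helper and its return shape are my own, for illustration):</p>

```ruby
require "uri"

# Sketch: pull connection details out of a DATABASE_URL (format as shown
# in the env.log excerpt above), so a deployment script could migrate the
# bound database before running `cf push`.
def db_config_from_url(database_url)
  uri = URI.parse(database_url)
  {
    adapter:  uri.scheme,
    host:     uri.host,
    port:     uri.port,
    username: uri.user,
    password: uri.password,
    database: uri.path.sub(%r{\A/}, ""),  # strip the leading slash
  }
end
```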
http://www.dogbiscuit.org/mdub/weblog/Tech/DevOps/CloudFoundry/PlayingWithCloudFoundryV2Play-ing with Cloud Foundry v22013-06-17T21:50:00+10:002013-06-17T21:50:00+10:00<p>
I'm currently messing about a bit with <a href='http://www.cloudfoundry.com'>Cloud Foundry</a>, an open-source Platform-as-a-Service which promises to be something akin to Heroku-in-your-data-center, or perhaps Heroku-in-your-AWS-VPC. In any case, it's exciting stuff, and appears to be moving fast, under the guidance of <a href='http://gopivotal.com'>Pivotal</a>.
</p>
<p>
Brian McClain posted an <a href='http://catdevrandom.me/blog/2013/05/16/buildpacks-in-cloud-foundry-v2/'>article</a> recently about getting a Haskell app running in Cloud Foundry. I decided to do something similar using <a href='http://www.playframework.com'>Play</a>.
</p>
<p>
I hadn't used Play before, but it turns out that it's pretty easy to get a basic app up and running:
<pre>
<no-highlight>$ play new playpen
</no-highlight></pre>
</p>
<p>
A quick run locally, to check that everything is hanging together:
<pre>
<no-highlight>$ cd playpen
$ play start
$ open http://localhost:9000
</no-highlight></pre>
</p>
<p>
All good, so I'll try pushing it up to Cloud Foundry.
<pre>
<no-highlight>$ cf push
Name> playpen
Instances> 1
1: 64M
2: 128M
3: 256M
4: 512M
Memory Limit> 256M
Creating playpen... OK
1: playpen
2: none
Subdomain> playpen
1: cfapps.io
2: none
Domain> cfapps.io
Binding playpen.cfapps.io to playpen... OK
Create services for application?> n
Bind other services to application?> n
Save configuration?> y
Saving to manifest.yml... OK
Uploading playpen... OK
Starting playpen... OK
-----> Downloaded app package (1020K)
Installing java.
Downloading JDK...
Copying openjdk-1.7.0_21.tar.gz from the buildpack cache ...
Unpacking JDK to .jdk
/var/vcap/packages/dea_next/buildpacks/lib/buildpack.rb:63:in `start_command': Please specify a web start command in your manifest.yml or Procfile (RuntimeError)
from (erb):6:in `generate_startup_script'
from /usr/lib/ruby/1.9.1/erb.rb:838:in `eval'
from /usr/lib/ruby/1.9.1/erb.rb:838:in `result'
from /var/vcap/packages/dea_next/buildpacks/lib/staging_plugin.rb:110:in `generate_startup_script'
from /var/vcap/packages/dea_next/buildpacks/lib/buildpack.rb:84:in `startup_script'
from /var/vcap/packages/dea_next/buildpacks/lib/staging_plugin.rb:139:in `block in create_startup_script'
from /var/vcap/packages/dea_next/buildpacks/lib/staging_plugin.rb:138:in `open'
from /var/vcap/packages/dea_next/buildpacks/lib/staging_plugin.rb:138:in `create_startup_script'
from /var/vcap/packages/dea_next/buildpacks/lib/buildpack.rb:19:in `block in stage_application'
from /var/vcap/packages/dea_next/buildpacks/lib/buildpack.rb:12:in `chdir'
from /var/vcap/packages/dea_next/buildpacks/lib/buildpack.rb:12:in `stage_application'
from /var/vcap/packages/dea_next/buildpacks/bin/run:10:in `<main>'
Checking playpen...
Application failed to stage
</no-highlight></pre>
</p>
<p>
Well, that started well, but finished badly. Cloud Foundry assumed I was pushing a generic Java app. We need to tell it how to handle Play applications.
</p>
<p>
Buildpacks to the rescue! Buildpacks are a concept that Cloud Foundry has borrowed <a href='https://devcenter.heroku.com/articles/buildpacks'>from Heroku</a> - they adapt the generic PaaS to the specifics of a particular application framework and/or language. A buildpack takes your application source-code as input, and outputs a compiled package that can be run on the target PaaS. Some buildpacks are very specific to the app framework; others are more generic, and can support multiple frameworks.
</p>
<p>
It turns out that the buildpack required for a Play 2.1.1 app is <a href='https://github.com/heroku/heroku-buildpack-scala'>heroku-buildpack-scala</a>. This was written for Heroku, but can be used without changes on Cloud Foundry!
</p>
<pre>
<no-highlight>$ cf push --buildpack https://github.com/heroku/heroku-buildpack-scala.git
Using manifest file manifest.yml
Not applying manifest changes without --reset
See `cf diff` for more details.
Uploading playpen... OK
Changes:
buildpack: '' -> 'https://github.com/heroku/heroku-buildpack-scala.git'
Updating playpen... OK
Stopping playpen... OK
Starting playpen... OK
-----> Downloaded app package (1020K)
Initialized empty Git repository in /tmp/buildpacks/heroku-buildpack-scala.git/.git/
Installing heroku-buildpack-scala.git.
-----> Installing OpenJDK 1.6...done
-----> Building app with sbt
-----> Running: sbt clean compile stage
Getting net.java.dev.jna jna 3.2.3 ...
Getting org.scala-sbt sbt 0.12.2 ...
(... etc etc ... download half the internet ...)
[success] Total time: 3 s, completed Jun 17, 2013 5:39:55 AM
-----> Dropping ivy cache from the slug
-----> Uploading staged droplet (127M)
-----> Uploaded droplet
Checking playpen...
Staging in progress...
0/1 instances: 1 starting
1/1 instances: 1 running
OK
</no-highlight></pre>
<p>
Success! That's one more toy application in the cloud!
</p>
http://www.dogbiscuit.org/mdub/weblog/Tech/Programming/Ruby/AutotestHooksFun with Autotest hooks2013-06-05T10:40:00+10:002013-06-05T10:40:00+10:00<p>I've been doing more tight-feedback-loop unit-testing in Ruby recently, and find myself using <code>autotest</code> (from the <a href="https://github.com/seattlerb/zentest">ZenTest</a> suite) a lot.</p>
<p>One of the great things about <code>autotest</code> is its hook mechanism, which allows you to hang behaviour on the passing or failing of a test suite. Many people have written hooks that publish test results via a notification mechanism such as Growl. I wrote a similar one that signals the state of the test-suite via the title of the terminal window running <code>autotest</code>:</p>
<pre><code>def set_title(title)
  if ENV["TERM"] =~ /^xterm/
    puts "\e]0;#{title}\007"
  end
end

Autotest.add_hook(:green) do
  set_title "GREEN - all passed"
end

Autotest.add_hook(:red) do |autotest|
  set_title "RED - #{autotest.files_to_test.size} failure"
end
</code></pre>
<figure>
<img src="AutotestHooks.png" />
</figure>
<p>Also, I have a tendency to checkpoint with "<code>git add</code>" whenever the tests pass. That's easily delegated to <code>autotest</code>:</p>
<pre><code>Autotest.add_hook(:green) do
  checkpoint_command = "git add ."
  puts "AUTOTEST_CHECKPOINT> #{checkpoint_command}"
  system(checkpoint_command)
end
</code></pre>
<p>Does anyone have other useful hooks to share?</p>
http://www.dogbiscuit.org/mdub/weblog/Tech/Programming/Ruby/MultiThreadedProcessingWithLazyEnumerablesMulti-threaded processing with lazy enumerables2013-05-23T23:30:00+10:002013-05-23T23:30:00+10:00<p>A project I'm working on at the moment involves trawling a large collection (millions) of S3 objects, and I wanted to parallelize the processing across multiple threads.</p>
<p>As it happens, it's actually pretty easy to process a Ruby collection using multiple Threads:</p>
<pre><code class="ruby">inputs = [1, 2, 3, 4, 5]
threads = inputs.collect { |i| Thread.new { i * i } }
outputs = threads.collect { |thread| thread.join.value }
outputs #=> [1, 4, 9, 16, 25]
</code></pre>
<p>The problem with this naive approach, though, is that you end up creating a Thread for each element of the collection, all at once. For large collections, that's a bad idea. Typically you want to limit the number of Threads you have running simultaneously.</p>
<figure>
<img src="diagrams/eager.png" />
<figcaption>Figure 1: Thread explosion!</figcaption>
</figure>
<p>We can fix the do-everything-at-once problem using lazy enumeration:</p>
<pre><code class="ruby">require 'lazily' # or use Ruby 2
inputs = [1, 2, 3, 4, 5]
lazy_threads = inputs.lazily.collect { |i| Thread.new { i * i } } # lazy!
outputs = lazy_threads.collect { |thread| thread.join.value }
outputs.to_a #=> [1, 4, 9, 16, 25]
</code></pre>
<p>Okay, but now we have the opposite problem: the worker Threads aren't created until immediately before their outputs are required, so we don't get any parallelization.</p>
<figure>
<img src="diagrams/lazy.png" />
<figcaption>Figure 2: Not really parallel</figcaption>
</figure>
<p>So, what if we reintroduced <em>just a little</em> eagerness? Let's <strong>prefetch</strong> some of the Threads in the <code>lazy_threads</code> collection, before we actually need them:</p>
<pre><code class="ruby">require 'lazily'
inputs = [1, 2, 3, 4, 5]
lazy_threads = inputs.lazily.collect { |i| Thread.new { i * i } }
prefetched_threads = lazy_threads.prefetch(2) # <- added magic
outputs = prefetched_threads.collect { |thread| thread.join.value }
outputs.to_a #=> [1, 4, 9, 16, 25]
</code></pre>
<p>The implementation of <code>#prefetch</code> is pretty straightforward; it creates a lazy buffer in front of another lazy enumerable, which keeps itself full as it feeds elements to its consumer.</p>
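<p>To make the idea concrete, here's a minimal sketch of a prefetching wrapper - not Lazily's actual implementation, just the shape of it - built from a plain <code>Enumerator</code> and Ruby 2's built-in <code>.lazy</code>:</p>

```ruby
# Minimal sketch of prefetching (not Lazily's real code): keep a buffer
# of up to n elements pulled from the upstream enumerable, so upstream
# work (e.g. Thread creation) happens ahead of the consumer.
def prefetch(enumerable, n)
  Enumerator.new do |yielder|
    buffer = []
    enumerable.each do |element|
      buffer << element
      yielder << buffer.shift if buffer.size > n
    end
    buffer.each { |element| yielder << element }  # flush the tail
  end
end

threads = [1, 2, 3, 4, 5].lazy.map { |i| Thread.new { i * i } }
prefetch(threads, 2).map { |t| t.join.value }  # => [1, 4, 9, 16, 25]
```

When the consumer asks for the first element, the wrapper has already pulled (and so started) the next couple of Threads behind it.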
<p>Bingo! Now we're starting a limited number of Threads in advance of needing their outputs.</p>
<figure>
<img src="diagrams/anticipation.png" />
<figcaption>Figure 3: Antici........pation</figcaption>
</figure>
<p>We can size the prefetch "window" to get some parallelization, without creating an explosion of Threads. Even better, we've managed to do it without involving tricksy multi-threading operators like <code>Mutex</code> or <code>Queue</code>. And, the collection of outputs is lazy, so we can use this approach to process large (even infinite?) collections.</p>
<p>This "sliding window" approach to multi-threading is implemented in my new gem, <a href="https://github.com/mdub/lazily">Lazily</a>, as <code>#in_threads</code>:</p>
<pre><code class="ruby">require 'lazily'
inputs = [1, 2, 3, 4, 5]
outputs = inputs.lazily.in_threads(4) { |i| i * i }
outputs.to_a #=> [1, 4, 9, 16, 25]
</code></pre>
<p>(Lazily is an implementation of Ruby2-like lazy Enumerables in pure Ruby. Give it a spin if you're eager to be lazy, and can't deploy to ruby-2.0.0).</p>
http://www.dogbiscuit.org/mdub/weblog/Tech/Agile/ThreeRsOfTestAutomationThe Three R's of Test Automation2011-07-11T22:00:00+10:002011-07-11T22:00:00+10:00<p>To properly assess a test's value, you need to consider both the benefit it provides, <u>and</u> the cost of creating, maintaining, and executing it. When the cost outweighs the benefit, the test has ceased to provide (net) value.</p>
<p>I have a few simple criteria I use to gauge the value provided by an automated test. Even better, I've managed to make them all start with the same letter, so I now proclaim:</p>
<blockquote>
<big><strong>The Three R's of Test Automation:</strong></big>
<p>
An automated test must be <strong>Rapid</strong>, <strong>Reliable</strong>,
and <strong>Relevant</strong>.
</p>
</blockquote>
<h2>Rapid </h2>
<p>All other things being equal, quick tests are better than slow tests. Quick tests add more value, because they provide feedback earlier, and can be executed more often.</p>
<p>Quick tests require fewer resources to execute, reducing cost. Often, the maintenance cost is also lower, as quick tests tend to be simpler and involve fewer moving parts.</p>
<h2>Reliable</h2>
<p>If a test fails, and that failure highlights a defect, which otherwise might have gone undetected for a while, then the test provided value. There's also substantial value in the confidence gained when tests pass.</p>
<p>The benefit can quickly be eroded, though, if tests are not trustworthy. Where they are known to be deficient, additional manual testing must be performed to gain confidence, which undermines the value of test automation.</p>
<p>Also, tests can turn into a major maintenance burden if they're overly specific, or too tied to implementation details, because they become fragile - breaking not because of an introduced defect, but because something unimportant changed.</p>
<h2>Relevant</h2>
<p>Automated tests are not <em>inherently</em> valuable; they are not an end unto themselves. So, what makes them valuable? The obvious answer is that they provide useful feedback ... but, what makes that feedback useful? I think feedback is only useful when it <em>mitigates risk</em>. </p>
<p>Automated tests are less relevant when they test functionality which:</p>
<ul>
<li>is less valuable (e.g. the features that nobody uses)</li>
<li>is less likely to fail (e.g. stable)</li>
<li>is already well-covered by other tests</li>
</ul>
<h2>Summary</h2>
<p>When considering whether to create an automated test, or retain an existing one, consider both:</p>
<ul>
<li>How valuable is the feedback it provides?</li>
<li>How much will it cost you to maintain and execute it?</li>
</ul>
http://www.dogbiscuit.org/mdub/weblog/Tech/Agile/FlippingTheCardWallFlipping the card wall2011-02-05T23:30:00+11:002011-02-05T23:30:00+11:00<p>Most agile software development projects have a "card wall", with each card representing a story, or a task, or some other unit of work. </p>
<p>Typically, the wall is arranged in columns with labels such as: <em>In Analysis</em>, <em>Ready for Dev</em>, <em>Developing</em>, <em>Ready for Test</em>, <em>Testing</em>, and eventually, <em>Done</em>. And typically, cards progress across the wall from left to right. The labels vary, a lot, but the left-to-right thing is fairly standard.</p>
<p>The thing is, I've noticed a tendency for people to concentrate on the leftmost stuff first. That is, your focus drifts towards work that is in-progress, or not yet begun, partly because it's the first thing to catch your eye. I argue that this increases the risk of work over in the right-most "almost done" columns languishing, incomplete and undelivered, while the team moves on to other things.</p>
<p>So, on my current project, I convinced the team to try something different. Observe ...</p>
<p align="center">
<img src="fedex-card-wall.png" alt="flipped card wall" title="our card wall, after flipping" width="495" height="178"/>
</p>
<p>The key difference here is that cards progress from <em>right-to-left</em>, rather than the other way around. Work that is almost finished is on the left, while stuff we haven't started yet is way over on the right. As I suspected, this orientation encourages people (including me) to think about the things that are almost-but-not-quite done, first. My hope is that this will help us <em>pull</em> (as opposed to <em>push</em>) work through our process, and reduce "inventory" at each step.</p>
<p>At the same time, the labels on our columns are all verbs, making it very clear what needs to happen with that card next. These, then, are our columns:</p>
<ul>
<li><p><strong>Advertise</strong>: If we've recently released useful new features, our quickest path to providing value is to get people using them. We hold a "showcase" every two weeks or so, but often use other channels (email, internal blog, wandering over) to promote our wares, as well.</p></li>
<li><p><strong>Release</strong> (and Document): Much of our work needs to be made available explicitly, by upgrading a network service, or publishing a new version of a software component. We try to do this as soon as possible once we consider it "done". Part of releasing is making sure that suitable documentation is available, if necessary.</p></li>
<li><p><strong>Review</strong>: If someone has recently completed a task, we want someone else to review it before we consider it "good to go". Sometimes another member of the team can do it, but usually we'll ask an end-user to give it the seal of approval (or not, as the case may be). Getting this feedback is more important than building new stuff.</p></li>
<li><p><strong>Implement</strong>: Assuming we've cleared out the previous couple of columns, we can do some actual programming (with frequent pauses to drink coffee and/or insult each other's choice of text editor). We don't use separate columns to distinguish what's being worked on, and what's as-yet unstarted; instead, the presence of one or more people's names on a card indicates that it's getting attention.</p></li>
<li><p><strong>Discuss</strong>: Sometimes a good whiteboard session is required before we can feel comfortable that a card is ready to be worked on. Often these discussions result in cards being split into smaller, more concrete steps.</p></li>
<li><p><strong>Consider</strong>: On the right hand side is a cloud of things that are really good ideas. We keep them around to remind ourselves, and others, that while these are really good ideas, we're NOT working on them right now. We don't bother attempting to keep these in priority order, as priorities are likely to change radically before we get to them; instead, we'll do just-in-time prioritisation when space becomes free further to the left of the board. </p></li>
</ul>
<p>We keep our current "milestone" (medium-term goal) in plain sight, on a card just above our column headings. This helps us maintain focus, by highlighting when we are tempted to work on things that are "off-mission".</p>
<p>I think this has been a positive change, and has <em>helped</em> us reduce work-in-progress. It's no silver bullet, though. For instance, we still tend to spend too much time (IMHO) thinking about potential upcoming work, as evidenced by the cluster of cards in "Discuss", in the photo above. Still, early days.</p>
http://www.dogbiscuit.org/mdub/weblog/Tech/Projects/ClampClamp - a Ruby command-line framework2010-12-06T21:30:00+11:002010-12-06T21:30:00+11:00<p>In the course of my current project, I've been writing a bunch of command-line utilities. While they're just Ruby scripts, much of their work is interacting with the user: accepting command-line options and arguments, and providing useful feedback in case of errors. So, I wrote a little framework to make it easier. It's called <a href="http://github.com/mdub/clamp">Clamp</a>.</p>
<p>Clamp models a command as a Ruby class, and a command execution as an instance of that class. Command classes look like this:</p>
<pre><code>class SpeakCommand < Clamp::Command

  option "--loud", :flag, "say it loud"

  option ["-n", "--iterations"], "N", "say it N times", :default => 1 do |s|
    Integer(s)
  end

  parameter "WORDS ...", "the thing to say", :attribute_name => :words

  def execute
    the_truth = words.join(" ")
    the_truth.upcase! if loud?
    iterations.times do
      puts the_truth
    end
  end

end
</code></pre>
<p>At execution time, Clamp uses the "option" and "parameter" declarations to map command-line arguments onto the command object as instance variables.</p>
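<p>The shape of that model is easy to see in plain Ruby. This toy version - not Clamp's implementation, and without any of its option parsing, validation, or help output - shows the class-per-command, instance-per-execution idea:</p>

```ruby
# Toy command-as-class (not Clamp itself): the class declares what it
# accepts, an instance represents one invocation, and #execute does
# the work.
class SpeakCommand
  attr_reader :words, :loud

  def initialize(argv)
    argv = argv.dup                         # don't mutate the caller's ARGV
    @loud = !argv.delete("--loud").nil?     # crude flag handling
    @words = argv
  end

  def execute
    the_truth = words.join(" ")
    loud ? the_truth.upcase : the_truth
  end
end

SpeakCommand.new(["--loud", "hello", "world"]).execute  # => "HELLO WORLD"
```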
<p>There are numerous Ruby libraries out there to help with parsing of command-line options - even a couple built into the standard library - so why write something new? Partly, it's because most of the alternatives only address option parsing. I wanted to focus less on parsing options, and more on modelling the command itself. Clamp is similar in some ways to <a href="https://github.com/wycats/thor">Thor</a>, the command framework behind Bundler and Rails3 generators, though Thor models commands as methods, rather than classes. Thor is also quite a bit bigger.</p>
<p>Anyway, next time you're writing a command-line utility in Ruby, I hope you give Clamp a go, and that it makes your job a little bit easier.</p>
http://www.dogbiscuit.org/mdub/weblog/Tech/Programming/Ruby/GemRequire"gem require"2010-12-06T09:00:00+11:002010-12-06T09:00:00+11:00<p>One thing that irks me about Rubygems is that it provides no cheap way to ensure that a gem is installed. For instance, in a script, I want to check that I've got "heroku" installed, before calling it. My options include:</p>
<ul>
<li>just assume it's installed, and fail if the assumption is bad;</li>
<li>call "<code>gem install</code>", knowing that it will needlessly re-install the gem if it's already installed;</li>
<li>try calling "<code>heroku</code>", and fall back to "<code>gem install</code>" if it fails.</li>
</ul>
<p>I'm not happy with any of those options, so I wrote a simple gem command plugin to make it easier. Install it like so:</p>
<pre><code>$ gem install gem_require
Successfully installed gem_require-0.0.3
1 gem installed
</code></pre>
<p>Now you can use "<code>gem require</code>" in place of "<code>gem install</code>". It's similar, except that it short-circuits if you already have the required gem installed:</p>
<pre><code>$ gem require heroku
heroku (1.11.0) is already installed
</code></pre>
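<p>Under the hood, that short-circuit is just a question to the RubyGems API. A rough sketch of the check (using <samp>Gem::Specification</samp> methods from modern RubyGems, which may differ from what the plugin actually calls):</p>

```ruby
require "rubygems"

# Returns true if any installed version of the named gem is present.
def gem_installed?(name)
  !Gem::Specification.find_all_by_name(name).empty?
end

gem_installed?("surely-not-a-real-gem-xyzzy")  # => false
```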
<p>If you want to ensure that you're on the bleeding edge, use the "<code>--latest</code>" option:</p>
<pre><code>$ gem require --latest heroku
Installing heroku (1.14.6) ...
Installed heroku-1.14.6
</code></pre>
<p>Of course, if you already <strong>have</strong> the latest version, there's no need to re-install it:</p>
<pre><code>$ gem require --latest heroku
heroku (1.14.6) is already installed
</code></pre>
<p>I hope that helps somebody else.</p>
http://www.dogbiscuit.org/mdub/weblog/Tech/Projects/OneProjectThreeGemsOne project, three gems2010-09-14T22:00:00+10:002010-09-14T22:00:00+10:00<p>A couple of weeks ago I gave a presentation at the Melbourne Ruby user-group, talking about three Ruby gems that emerged from my recent work at Lonely Planet.</p>
<ul>
<li><p><a href="http://github.com/mdub/arboreal">arboreal</a> is an ActiveRecord extension to support navigation of tree-shaped data.</p></li>
<li><p><a href="http://github.com/mdub/representative">Representative</a> makes it easy to create XML or JSON representations of Ruby objects.</p></li>
<li><p><a href="http://github.com/mdub/sham_rack">ShamRack</a> I've <a href="ShamRack">blogged about</a>, already.</p></li>
</ul>
<p>I'll write more about the first two in weeks to come. Meanwhile, here are the slides:</p>
<div style="width:425px" id="__ss_5063927"><strong style="display:block;margin:12px 0 4px"><a href="http://www.slideshare.net/mdubya/one-project-3-gems" title="One project, 3 gems">One project, 3 gems</a></strong><object id="__sse5063927" width="425" height="355"><param name="movie" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=1-project-3-gems-100826170251-phpapp02&stripped_title=one-project-3-gems" /><param name="allowFullScreen" value="true"/><param name="allowScriptAccess" value="always"/><embed name="__sse5063927" src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=1-project-3-gems-100826170251-phpapp02&stripped_title=one-project-3-gems" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="355"></embed></object></div>
http://www.dogbiscuit.org/mdub/weblog/Tech/Agile/AllAboardTheReleaseBusAll aboard the "Release Bus"2010-09-12T22:00:00+10:002010-09-12T22:00:00+10:00<p><em>
Spoiler: A "Release Bus" groups story cards to create a tangible representation of what's going into a release.
</em></p>
<p>These days, it's pretty common for software projects to have a "story wall", or "task board", across which story cards move as they progress from "written" to "done".</p>
<p>In a recent project, we were doing exactly that, but were having problems with the "done" cards, which ended up swimming around together in a column labelled "Signed Off".
It was difficult to see which of the signed-off cards would be included in the next release.
It was unclear when cards should be removed from the wall.
And, some cards didn't even involve software changes, e.g. where we were "spiking", doing some analysis of production data, or tweaking the production infrastructure.</p>
<p><img src="release-bus.png" alt="example Release Bus" title="a fully-loaded Release Bus, ready to leave the station" align="right" width="225" height="300" style="margin-left: 1em"/></p>
<p>Solution: We created a "Release Bus" — an A3 sheet of paper, onto which releasable cards are placed once they're signed off. This simple sheet of paper suddenly made the release so much more tangible. In essence, it's the release-level equivalent of a story card.</p>
<p>It also provides a handy place to track other aspects of the release, like the release number, and a candidate build number. We typically have a "release bitch" [sic], responsible for tracking the release into production, and the bus was a handy place to capture that, too. </p>
<p>Best of all: as the release moves off through the final stages of pre-release testing, and into production, we have a single artifact that we can move off into an archive, rather than dealing with individual story cards.</p>
<p>In the picture above, you'll also notice a "pedestrian" area; we use this for those operational or investigative cards that didn't result in releasable software.</p>
<p>All in all, a useful wee innovation.</p>
http://www.dogbiscuit.org/mdub/weblog/Tech/Agile/CarefulSoftwareDevelopmentCareful Software Development2009-09-03T22:00:00+10:002009-09-03T22:00:00+10:00<p>
Enough with this seat-of-your-pants "eXtreme Programming", this twisty-turny "Agile" bollocks, and this anorexic "Lean"!
You're all a bunch of cowboys!
</p>
<p>
The time has come for a return to a more sensible, conservative approach. Just be a little more careful, okay?
</p>
<h4>
Careful, don't waste your money (and time)
</h4>
<p>
It's sensible to ensure that what you're building has <u>real business value</u>.
Ideally, you should be building the highest-value stuff first.
Anything less, and you're being careless with your (or someone else's) money.
</p>
<h4>
Careful, make sure it's useful
</h4>
<p>
It's worth saying again ...
</p>
<p>
If you build stuff that users don't want, nobody's going to end up happy.
I suggest you keep in close contact with them.
Show them what you're building for them as soon as possible (yes, even before it's "finished"),
so they can help keep you pointed in the right direction.
</p>
<p>
It's important to listen to your users, but don't forget that they (like you) only have part of the picture.
You need to work <u>with</u> them to find the best way to solve their problems,
rather than just implementing their proposed solutions without question.
</p>
<p>
Even when <u>you</u> eventually understand what the requirements are, the rest of the team might not.
So you'd best write the requirements down somewhere. Somewhere safe.
Code (in the form of automated tests) is an excellent way to express requirements, because (unlike other options) it's hard to misinterpret.
</p>
<h4>
Careful, don't miss the turn-off
</h4>
<p>
Plans are great; make sure you have one. But don't follow it blindly.
Once you get going, you'll probably discover lots of things you didn't know when you made the plan.
You may even stumble upon some useful short-cuts: ways to deliver more value, for less cost!
Don't let the plan prevent you from taking those opportunities.
</p>
<p>
After all, even if you have a map, it's still useful to look at the road ahead.
Where'd you get that map, anyway? Oh, you drew it yourself?!
Well, why not slip that back into its protective plastic cover, so it doesn't get grubby. There ya go.
And keep an eye out for road signs, okay? Excellent.
</p>
<h4>
Careful, free stuff isn't always cheap
</h4>
<p>
You may find something that looks like it might solve all (or some) of your problems.
Perhaps some free software, or a product you already have licenses for.
That would be nice.
</p>
<p>
But beware! Adapting your problem to someone else's solution is hard,
and if you fail, it leaves you in a delicate situation.
In some situations, rolling your own solution works out cheaper in the long run.
</p>
<h4>
Careful, don't bite off more than you can chew
</h4>
<p>
Q. How do you eat an elephant? A. One bite at a time.
</p>
<p>
For god's sake, don't give yourself indigestion by tackling the whole trunk at once!
I had a cousin who did that, and she still can't ride a bike properly.
</p>
<p>
Slow and steady wins the race.
</p>
<h4>
Careful, don't break the stuff that's already working
</h4>
<p>
Software's not easy, you know. It gets complicated inside that big box. And hot.
You might accidentally trip over some important wires while you're in there installing the new
fnord-wangling module.
</p>
<p>
You know those automated tests, ummm I mean requirements, you captured earlier? Just run them again.
Actually, run them as frequently as you can; that should alert you to any inadvertent mistakes as soon as they occur.
</p>
<h4>
Careful, don't let it fail in production
</h4>
<p>
Oh boy, if it goes wrong in production, your users are going to be mighty grumpy.
That's embarrassing.
And these things tend to happen whenever it's least convenient.
You might just have to cancel that ski-trip.
</p>
<p>
Careful, too, to have good processes for finding and fixing production issues if and when they do occur.
Make sure you can turn fixes around quickly. You may make the ski-fields yet!
</p>
<h4>
Careful, don't step on your own toes
</h4>
<p>
Make sure that the team is all working towards the same goal; all pulling in the same direction.
Yeah, unfortunately that means talking to each other. All the time.
But otherwise, you'll just get in each other's way.
</p>
<h4>
Careful, don't be left holding the (ugly, vomiting) baby
</h4>
<p>
There's a chance (actually, let's call it a certainty), that the rest of the project team will desert you.
The clever ones are likely to leave first.
</p>
<p>
So, I suggest you ensure you have a decent understanding of the system before they go.
Documentation might help. But probably not: most of it tends to be kind of useless.
You're probably better off working with them a bit while they're still around.
That way you can ask them questions.
</p>
<p>
And if you work together, you'll likely produce a more maintainable result, anyway.
</p>
<h4>
Okay, then
</h4>
<p>
In the immortal words of Sergeant Phil Esterhaus:
<blockquote>
"Let's be careful out there."
</blockquote>
</p>
<p>
Thanks for listening.
</p>
http://www.dogbiscuit.org/mdub/weblog/Tech/Projects/ShamRackIntroducing ShamRack2009-07-03T15:50:00+10:002009-07-03T15:50:00+10:00<p>
The system I'm currently working on integrates with several external systems, over HTTP, using simple (RESTish) web-services. I really don't want to involve those external systems while testing my own, though; I want to stub 'em out.
</p>
<p>
My first attempt involved stubbing out HTTP calls using my mocking framework of choice. I'm using <a href='http://rest-client.heroku.com'>RestClient</a>, which I like a lot, and stubbing out RestClient API calls worked quite well. It kept on working quite well for several hours, until I decided to refactor a little, using RestClient in a slightly different way, at which point it broke completely. Bother. I really don't like having tests coupled to implementation details, so went searching for another way.
</p>
<p>
<a href='http://fakeweb.rubyforge.org'>FakeWeb</a> looked pretty good, in that it stubs things out at the Net::HTTP layer, which I'm unlikely to refactor out of the picture. In the end, though, it's not really what I wanted. I wanted to be able to do things like:
</p>
<ul>
<li>
verify the body (and mime-type) of a POST/PUT request
</li>
<li>
dynamically generate responses, based on some aspect of the request (e.g. query parameters)
</li>
</ul>
<p>
In short, I wanted a <a href='http://xunitpatterns.com/Fake%20Object.html'>Fake Object</a>, rather than a simple stub.
</p>
<p>
It occurred to me around about then that we already have plenty of tools for describing the behaviour of web-applications: they're called web-application frameworks! Many of them are too heavy-weight for my purposes, but <a href='http://www.sinatrarb.com/'>Sinatra</a> is nicely minimal. So, 60 lines of Ruby code later, I had a little web-app that mimicked one of those external web-services sufficiently for my testing. Win!
</p>
<p>
But waitaminut. I really don't want to have to start a separate process running my fake web-service, and talk to it using HTTP. That's going to be slow: network I/O isn't cheap. Isn't there some way I can use something like Sinatra but still keep everything in-process?
</p>
<p>
There is now. <a href='http://github.com/mdub/sham_rack'>ShamRack</a> plumbs Net::HTTP directly into applications built to run on <a href='http://rack.rubyforge.org/'>Rack</a>. Which includes all Sinatra apps, as well as Rails, Merb, etc.
</p>
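<p>The Rack contract that makes this possible is tiny: an app is anything that responds to <samp>call(env)</samp> and returns a status/headers/body triple. A fake web-service can therefore be very small indeed; here's a sketch (the endpoint and responses are invented for illustration):</p>

```ruby
# A minimal Rack app: a lambda taking the env Hash and returning
# [status, headers, body-enumerable].
fake_service = lambda do |env|
  if env["PATH_INFO"] == "/greeting"
    [200, { "Content-Type" => "text/plain" }, ["Hello, tester!"]]
  else
    [404, { "Content-Type" => "text/plain" }, ["not found"]]
  end
end

status, _headers, body = fake_service.call("PATH_INFO" => "/greeting")
# status == 200, body == ["Hello, tester!"]
```

<p>With ShamRack, an app like this gets mounted against a hostname, so that Net::HTTP requests to that host are routed to it in-process.</p>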
<p>
<div align="center">
<img src="/mdub/images/ShamRack.png" />
</div>
</p>
<p>
Using ShamRack, I avoid the network traffic, making the tests a whole lot faster (about 25 times faster, in my case). Plus, I avoid the complication of having to start and stop an external web-server. Finally, because my fake web-service app is in-process, I get a handy back-channel I can use to set up or inspect its state during tests.
</p>
<p>
If you find ShamRack handy, or have ideas about how it could improve, let me know!
</p>
http://www.dogbiscuit.org/mdub/weblog/Tech/Programming/Ruby/SpyingOnYourCodeWithRRSpying on your code with RR2009-05-27T22:39:00+10:002009-05-27T22:39:00+10:00<p>
A while back, Melbourne's own Pete Yandell created <a href='http://notahat.com/not_a_mock/'>Not A Mock</a>, an extension to RSpec that supports <a href='http://xunitpatterns.com/Test%20Spy.html'><em>test-spies</em></a>. And a damn fine idea it was, too.
</p>
<p>
I've recently discovered that my current favourite stub/mock framework, Brian Takita's <a href='http://github.com/btakita/rr'>RR</a>, can do test-spies too!
</p>
<h3>
Huh?
</h3>
<p>
What's this "spy" business about? Well, when <em>mocking</em>, <u>before</u> triggering the behaviour you're testing, you set up <em>expectations</em> that a certain methods of collaborating objects will be invoked, with the specified parameters. Like so:
</p>
<pre>
describe TransferEverything do

  before do
    @account1 = Account.new
    @account2 = Account.new
    @transfer = TransferEverything.new(:from => @account1, :to => @account2)
  end

  describe "#execute" do

    it "moves all funds from one account to the other" do
      all_the_money = 1.42
      stub(@account1).balance { all_the_money }
      mock(@account1).withdraw(all_the_money)  # <= set expectations
      mock(@account2).deposit(all_the_money)
      @transfer.execute                        # <= execute
    end                                        # <= verify expectations

  end

end
</pre>
<p>
The expectations are typically verified auto-magically, by the mocking framework, at the end of your test.
</p>
<h3>
The spy alternative
</h3>
<p>
Setting up expectations <em>before</em> a call always feels clumsy. Using a test <em>spy</em> makes tests flow more naturally:
</p>
<ol>
<li>
<strong>Stub</strong> out collaborators, setting up canned responses where required.
</li>
<li>
<strong>Execute</strong> the code you're testing.
</li>
<li>
<strong>Verify</strong> the results, including both:
</li>
<ul>
<li>
the outputs (return values, or resulting state)
</li>
<li>
the interactions (ie. the method-invocations you expected your fake collaborators to receive).
</li>
</ul>
</ol>
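<p>Here's a toy illustration of the spy pattern itself — hand-rolled, not RR's implementation: the fake records every call it receives, and "verify" is just inspection of that recording afterwards.</p>

```ruby
# A hypothetical minimal test-spy: records interactions for later checking.
class Spy
  attr_reader :calls

  def initialize
    @calls = []
  end

  def method_missing(name, *args)
    @calls << [name, args]    # record the interaction; answer with nil
    nil
  end

  def received?(name, *args)
    @calls.include?([name, args])
  end
end

account = Spy.new
account.withdraw(1.42)              # 2. execute (code under test calls it)
account.received?(:withdraw, 1.42)  # 3. verify afterwards  # => true
```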
<p>
Fur egg-sample:
</p>
<pre>
describe TransferEverything do
  # ...
  describe "#execute" do

    it "moves all funds from one account to the other" do
      all_the_money = 1.42
      stub(@account1).balance { all_the_money }
      stub(@account1).withdraw
      stub(@account2).deposit
      @transfer.execute
      @account1.should have_received.withdraw(all_the_money)
      @account2.should have_received.deposit(all_the_money)
    end

  end
end
</pre>
<p>
One thing I find particularly useful about this technique is the ability to execute code in a setup block, then verify the various aspects of its behaviour in separate test-cases.
</p>
<pre>
describe TransferEverything do
  # ...
  describe "#execute" do

    before do
      @all_the_money = 1.42
      stub(@account1).balance { @all_the_money }
      stub(@account1).withdraw
      stub(@account2).deposit
      @transfer.execute
    end

    it "withdraws all funds from source account" do
      @account1.should have_received.withdraw(@all_the_money)
    end

    it "deposits funds in receiving account" do
      @account2.should have_received.deposit(@all_the_money)
    end

  end
end
</pre>
<p>
This results in smaller, more coherent test-cases.
</p>
<h3>
Using RR test-spies in RSpec
</h3>
<p>
If you're using RSpec, you'll need to use the adapter class that comes with RR, rather than the one that comes with RSpec; it's this adapter that provides the <samp>have_received</samp> matcher. In your <samp>spec_helper.rb</samp>, do this:
</p>
<pre>
require 'rr'

Spec::Runner.configure do |config|
  config.mock_with RR::Adapters::Rspec
end
</pre>
<h3>
Spying on Java
</h3>
<p>
Honourable mention: if you're lucky (*cough*) enough to be coding Java, I HIGHLY recommend <a href='http://mockito.org'>Mockito</a>, which also implements test-spies, and is easily the best Java mocking/stubbing library around.
</p>
http://www.dogbiscuit.org/mdub/weblog/Tech/Mac/Rsync1TimeMachine0Rsync: 1, Time Machine: 02008-11-11T23:30:00+11:002008-11-11T23:30:00+11:00<p>
I recently bought a Mac Mini to serve various purposes about the house - not least of which, as a
remote backup server for my MacBook Pro.
</p>
<p>
At which point I spent several evenings wrestling with <a href='http://www.apple.com/macosx/features/timemachine.html'>Time
Machine</a>, with limited success. I moved my
existing (500G, external) drive to the Mac Mini, shared it, and nominated it as my backup volume.
But:
</p>
<ul class="sparse">
<li>
Time Machine wouldn't recognise the existing backups on that drive, and insisted on starting again
from scratch (because it creates sparsebundle disk images for remote backup clients, but not for
the local system). Annoying.
</li>
<li>
The initial backup took <strong>forever</strong>, because TM backs up <strong>everything</strong> not specifically excluded.
(Granted, I'm backing up over an 802.11g wireless network.)
</li>
<li>
Incremental backups kicked in every hour, and even when I hadn't been altering files, seemed to
take an excessive amount of time to complete, ie. around 15 minutes. Much of this time was spent
"preparing", and affected the performance of both my laptop, and the network. I don't need or want
hourly backup, but TM provides no way to set a less demanding schedule.
</li>
<li>
Several times things got borked when I interrupted a backup midway, and I had to reboot, remount or
otherwise intervene to get things working again.
</li>
</ul>
<p>
Eventually, I gave up, and went looking for alternatives. After flirting with
<a href='http://www.nongnu.org/rdiff-backup'>rdiff-backup</a> and <a href='http://rsnapshot.org'>rsnapshot</a>, I eventually
did a <a href='http://blog.interlinked.org/tutorials/rsync_time_machine.html'>little</a>
<a href='http://www.mikerubel.org/computers/rsync_snapshots'>research</a> and rolled my own rsync backup script:
</p>
<pre>
#! /bin/sh

set -e

snapshot_host=theLoungeRoomMac.local
snapshot_dir=/Volumes/WD_500/Snapshots/woollyams
snapshot_user=root

ssh_user=$snapshot_user@$snapshot_host

ping -o $snapshot_host > /dev/null || {
    echo "WARNING: can't see $snapshot_host -- skipping backup"
    exit 1
}

ssh $ssh_user "test -d $snapshot_dir" || {
    echo "ERROR: can't see $ssh_user:$snapshot_dir" >&2
    exit 2
}

snapshot_id=`date +%Y%m%d%H%M`

/usr/bin/rsync --archive --verbose \
    --delete --delete-excluded \
    --numeric-ids --extended-attributes \
    --one-file-system \
    --partial \
    --link-dest ../current/ \
    --relative \
    --max-size=50M \
    --exclude ".git" \
    --exclude ".svn" \
    /private/etc /Users/mdub \
    $ssh_user:$snapshot_dir/in-progress/

ssh $ssh_user "cd $snapshot_dir; rm -fr $snapshot_id; mv in-progress $snapshot_id; rm -f current; ln -s $snapshot_id $snapshot_dir/current"
</pre>
<p>
Advantages over Time Machine are:
</p>
<ul class="sparse">
<li>
I can run this as often or as infrequently as I like.
</li>
<ul>
<li>
I'm currently running it out of /etc/daily.local, which is run by
<a href='http://developer.apple.com/documentation/Darwin/Reference/ManPages/man8/periodic.8.html'>periodic</a>, which is run by
<a href='http://developer.apple.com/documentation/Darwin/Reference/ManPages/man8/launchd.8.html'>launchd</a>.
</li>
<li>
It doesn't get in my way by running while I'm actively using my machine.
</li>
</ul>
<li>
I can use the full power of rsync filter rules to exclude uninteresting files (e.g. "<code>--exclude .git
--exclude .svn</code>").
</li>
<li>
I can even filter by file size ("<code>--max-size=50M</code>") to skip things like big downloads and VMware
images, without having to explicitly nominate them.
</li>
<li>
It takes less than 3 minutes to perform an incremental backup (providing I haven't changed too much).
</li>
<li>
I can safely interrupt the backup process, or pull the plug, or whatever, and it's robust enough to carry
on where it left off next time.
</li>
<li>
I can keep as many time-stamped snapshots as I wish.
</li>
<li>
It's relatively efficient space-wise, due to the use of hard-links to share unchanged files between
snapshots (not as efficient as Time Machine, though, which hard-links entire directories).
</li>
<li>
Each snapshot is a simple, easy-to-browse, easy-to-search directory, containing plain old files and
directories. It gives me comfort that I wouldn't need a spiffy GUI to locate a file I was looking to
restore.
</li>
</ul>
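<p>That space-sharing between snapshots is plain hard-linking — the effect of rsync's <code>--link-dest</code> option. This little demonstration (file names invented) shows two "snapshot" directories sharing a single copy of an unchanged file:</p>

```shell
# Two snapshot directories; the second "snapshot" hard-links the
# unchanged file rather than copying it, so the data is stored once.
mkdir -p snap1 snap2
echo "unchanged content" > snap1/file.txt
ln snap1/file.txt snap2/file.txt     # hard link, not a copy

# Both names refer to the same inode:
[ snap1/file.txt -ef snap2/file.txt ] && echo "same inode"
```

<p>Deleting one snapshot never breaks the other; the file's data survives as long as any snapshot still links to it.</p>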
http://www.dogbiscuit.org/mdub/weblog/Tech/Agile/ThePointOfContinuousIntegration"Continuous Integration" might not mean what you think it means2008-07-08T23:25:00+10:002008-07-08T23:25:00+10:00<p>
<a href='http://martinfowler.com/articles/continuousIntegration.html'>Continuous
Integration</a> is a
common practice in <a href='http://agilemanifesto.org/'>Agile</a> development circles, but I
think people (especially those new to agile thinking) sometimes miss the point.
</p>
<p>
Problem is, the term has become synonymous with build-servers such as
<a href='http://cruisecontrol.sourceforge.net/'>CruiseControl</a>
(<a href='http://tinderbox.mozilla.org/'>etc</a>, <a href='https://hudson.dev.java.net/'>etc</a>), which
frequently grab the latest code, build it, and execute automated tests. These are
often referred to as "continuous-integration servers", which IMHO is a <u>really
bad name</u>, 'cos if there's one thing these servers typically <em>don't</em> do,
it's <u>integrate</u>.
</p>
<p>
And the point of continuous-integration is just that: <strong>Integrating</strong>. <strong>Continuously</strong>! Which means:
</p>
<ul class="sparse">
<li>
developers frequently updating their working-areas (or personal branches) with
the latest code on the mainline branch (typically many times a day), and
</li>
<li>
frequently merging their own changes back into the mainline (typically several
times a day).
</li>
</ul>
<p>
Unless you're doing this, you ain't "doing continuous integration", however frequently you're running automated builds!
</p>
<p>
Integrating continuously can be difficult. In particular, it forces you to chunk
larger changes and features into small, bite-sized pieces that can be drip-fed
into the codebase. And, you have to deal with other developers changing stuff all
the time. Build-servers and automated tests are essential tools here, because they
help keep the team honest, ensuring that everyone has a stable (if evolving) base
to work on.
</p>
<p>
There are plenty of upsides to frequent integration:
</p>
<ul>
<li>
each individual integration is smaller, and therefore easier
</li>
<li>
design issues (including differences of opinion) are identified earlier
</li>
<li>
developers can leverage each other's work earlier
</li>
<li>
changes can be tested (and bugs detected) earlier
</li>
<li>
software can be deployed more frequently
</li>
</ul>
<p>
In summary: check it in already!
</p>
http://www.dogbiscuit.org/mdub/weblog/Tech/Programming/AttackingSlowRunningBuildsAtCitconAttacking slow-running builds (notes from CITCON)2008-07-01T22:20:00+10:002008-07-01T22:20:00+10:00<p>
Last weekend I went along to <a href='http://www.citconf.com/'>CITCON</a> here in Melbourne. Which was great fun, by the way.
</p>
<p>
There I ran a session on "Attacking slow-running CI builds". It was a small group, but an interesting discussion, I think. Here are my (rough, unedited) notes:
</p>
<h3>
WHAT is the impact of a slow build?
</h3>
<ul>
<li>
fewer checkins
</li>
<li>
more waiting
</li>
<li>
context switching
</li>
<li>
discourages integration
</li>
<li>
discourages writing of additional tests
</li>
<li>
more chance of overlapping checkins
</li>
<li>
more build breakages
</li>
<li>
more time required to get the build fixed
</li>
<li>
reduced productivity
</li>
<li>
WASTE!
</li>
</ul>
<h3>
WHY is the build slow?
</h3>
<ul>
<li>
slow tests (particularly acceptance tests)
</li>
<ul>
<li>
over-testing (testing the same code-paths repeatedly)
</li>
<li>
expensive set-up and tear-down
</li>
<li>
too much testing via the user-interface
</li>
<li>
tests that pause, sleep, or poll (e.g. to deal with AJAX)
</li>
</ul>
<li>
too much I/O!
</li>
<li>
use of slow infrastructure components (database servers, application servers, etc.)
</li>
<li>
slow hardware
</li>
</ul>
<h3>
HOW can we make it faster?
</h3>
<ul>
<li>
faster hardware
</li>
<li>
run tests in parallel
</li>
<li>
distribute tests
</li>
<li>
fail fast
</li>
<ul>
<li>
selective testing: run tests most likely to fail first
</li>
<ul>
<li>
could use dependency-analysis to identify which tests were affected by recent commits
</li>
</ul>
</ul>
<li>
refactor story-based acceptance tests into scenario-based tests
</li>
<ul>
<li>
bigger tests, with more assertions, offsets set-up/tear-down costs
</li>
<ul>
<li>
but makes tests more complex
</li>
</ul>
</ul>
<li>
share test fixtures between a group of tests
</li>
<ul>
<li>
but breaks test isolation
</li>
</ul>
<li>
avoid I/O
</li>
<ul>
<li>
in-memory database
</li>
<li>
in-memory file-store (RAM disk?)
</li>
<li>
stub out infrastructure components
</li>
<ul>
<li>
avoid testing these components by side-effect
</li>
</ul>
</ul>
<li>
populate the database directly, rather than using the user-interface to set-up for a test
</li>
<li>
separate your system into components that can be tested independently
</li>
</ul>
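<p>As a concrete illustration of the "run tests in parallel" suggestion above, here's a minimal thread-based sketch in Ruby. The test names and bodies are invented; a real suite would partition test files across processes or machines rather than threads:</p>

```ruby
# Run each (toy) test concurrently, collecting pass/fail results.
tests = {
  "adds"       => -> { 1 + 1 == 2 },
  "multiplies" => -> { 2 * 3 == 6 },
  "fails"      => -> { 1 == 2 },
}

results = {}
mutex = Mutex.new
threads = tests.map do |name, test|
  Thread.new do
    passed = test.call                          # run the test
    mutex.synchronize { results[name] = passed } # record thread-safely
  end
end
threads.each(&:join)

failures = results.reject { |_, passed| passed }.keys
# failures == ["fails"]
```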
<h2>
Thinking about this later ...
</h2>
<h3>
There are two types ...
</h3>
<p>
The suggestions for improving build times seemed to fall into two categories:
<ol>
<li>
optimise the build/tests
</li>
<li>
throw additional hardware at the problem
</li>
</ol>
</p>
<p>
My problem with the "throw hardware at it" approach is that it typically <u>only helps for the build-server</u> machine; the poor old developers are still left with a slow-running build, and therefore many of the productivity issues still exist.
</p>
<h3>
Another idea
</h3>
<p>
It occurs to me now that we missed a fairly fundamental trick to improve test times: <u>improve the performance of the system-under-test itself</u>. It's a great excuse to start thinking about performance earlier in the project.
</p>
<h3>
"Customer Acceptance Test" does not need to mean end-to-end
</h3>
<p>
On all the projects I've been on in recent years, we've ended up with the majority of the tests being either "developer unit tests", which run super-fast, or "customer acceptance tests" which test end-to-end (browser-to-database) and run super-slow.
</p>
<p>
Methinks it should be less black-and-white. If we can demonstrate functionality that the customer cares about by calling the underlying logic directly (i.e. at unit-test level), rather than by exercising the user-interface, then what's wrong with that? (We just need one test to prove that the underlying logic has been properly integrated into the UI.)
</p>
http://www.dogbiscuit.org/mdub/weblog/Tech/Railsconf2008HighlightsRailsconf 2008 Highlights2008-06-05T22:00:00+10:002008-06-05T22:00:00+10:00<p>
I was lucky enough to be at Railsconf 2008 in Portland last weekend (along with
<a href='http://www.martyandrews.net/blog/'>Marty</a>, Rob, <a href='http://www.prozacblues.com/travo/blog/'>Trav</a> and
<a href='http://blog.hiremaga.com/'>Abhi</a>).
</p>
<h3>
Highlights
</h3>
<ul class="sparse">
<li>
Meeting other Ruby/Rails enthusiasts from all over. (Well, all over the US, at least).
</li>
<li>
<strong>Joel Spolsky</strong>'s opening keynote was hilarious (in a good way). Some other commentators found it
low on content, but I thought it had a strong message: usability matters!
</li>
<li>
Seeing <strong>Kent Beck</strong> present was fantastic. He had the audience hanging on his every word, as he
described how "anything he'd done had taken 20 years to have an impact".
</li>
<li>
Ezra's talk on <a href='http://brainspl.at/articles/2008/06/02/introducing-vertebra'><strong>Vertebra</strong></a>, his
XMPP-based "cloud control" project, was fascinating. What a great abuse of technology!
</li>
<li>
The <strong>JRuby</strong> and <strong>Rubinius</strong> teams are co-operating closely, in a spirit of friendly, respectful
rivalry. Particularly notable is their effort to collaborate (with each other, and Matz) on a
rigorous set of executable specs for the Ruby language.
</li>
<li>
The upcoming version of <a href='http://modrails.com'><strong>Phusion Passenger</strong></a> will support not only Rails
applications, but also <a href='http://rack.rubyforge.org/'>Rack</a> (and therefore Merb, Sinatra, Camping),
and (get this) <a href='http://en.wikipedia.org/wiki/Web_Server_Gateway_Interface'>WSGI</a> (and therefore a
bunch of Python frameworks, including <a href='http://www.djangoproject.com/'>Django</a>)!
</li>
<li>
There are <strong>increasingly varied options for deploying Rails apps</strong>, including the traditional
<code>{Apache,nginx}+{mongrel,thin}</code>, JRuby WARs in a servlet container, Passenger, and
the Amazon-EC2-based services like RightScale and Heroku. <a href='http://heroku.com'><strong>Heroku</strong>'s</a>
deployment model is pretty damn clever: just "<code>git push</code>".
</li>
</ul>
<h3>
Regrets
</h3>
<p>
With 4 streams going on, the talks I got to were naturally out-numbered by those I missed. Some of the ones I really wish I'd seen include:
</p>
<ul>
<li>
MagLev: Gemstone's Ruby implementation-in-progress, based on their Smalltalk VM
</li>
<li>
Scott Chacon on "Using Git" (apparently he went into mind-bending detail of the Git internals)
</li>
<li>
Justin Gehtland's "Small Things, Loosely Joined, and Written Fast"
</li>
</ul>
http://www.dogbiscuit.org/mdub/weblog/Tech/Mac/GitOnTheMacGit (on the Mac)2008-04-18T13:30:00+10:002008-04-18T13:30:00+10:00<p>
<a href='http://git.or.cz/'>Git</a> is the hype. I'm just starting to use it for a couple of projects, both directly, and as a local facade to Subversion.
</p>
<p>
Here are some suggestions on using git under Mac OS X.
</p>
<h3>
Installation
</h3>
<p>
Installation using MacPorts is pretty painless. Ensure you choose the "svn" variant if you want Git/Subversion integration.
</p>
<pre>
sudo port install git +svn +doc
</pre>
<p>
Another option is the native installer, available at <a href='http://code.google.com/p/git-osx-installer/'>http://code.google.com/p/git-osx-installer/</a>
</p>
<h3>
Textmate
</h3>
<p>
If you use Textmate, the <a href='http://gitorious.org/projects/git-tmbundle/'>Git Textmate bundle</a> is <strong>rather nice</strong>.
</p>
<pre>
cd ~/Library/Application\ Support/TextMate/Bundles
git clone git://gitorious.org/git-tmbundle/mainline.git Git.tmbundle
</pre>
<p>
Remember to set the TM_GIT variable (to "/opt/local/bin/git" or "/usr/local/bin/git", as the case may be), otherwise stuff won't work.
</p>
<h3>
Shell completion
</h3>
<p>
For command-line (bash) users, there's TAB-completion available, which is pretty handy. I'm using it directly from my local clone of the git
source tree, like this:
</p>
<pre>
# in .bashrc ...
git_completion_script=$HOME/OpenSource/kernel.org/git/contrib/completion/git-completion.bash
if test -f $git_completion_script; then
    source $git_completion_script
fi
</pre>
<h3>
GitNub for history browsing
</h3>
<p>
<a href='http://github.com/Caged/gitnub/wikis/home'>GitNub</a> is a sweet little UI for browsing history of git commits.
</p>
<h3>
Using Git
</h3>
<p>
So far, I haven't talked at all about how you actually USE the thing, and don't intend to, since there are already so many great resources out
there on the subject. Some I've found useful are:
</p>
<ul class="sparse">
<li>
<a href='http://www.kernel.org/pub/software/scm/git/docs/tutorial.html'>A tutorial introduction to git</a>
</li>
<li>
<a href='http://git.or.cz/course/svn.html'>Git for SVN users</a>
</li>
<li>
Git cheat-sheets from <a href='http://zrusin.blogspot.com/2007/09/git-cheat-sheet.html'>Zack Rusin</a> and <a href='http://cheat.errtheblog.com/s/git/'>Err the blog</a>
</li>
<li>
Andy Delcambre's <a href='http://andy.delcambre.com/2008/3/4/git-svn-workflow'>Git SVN Workflow</a>
</li>
</ul>
<h3>
ReadOnlyFormBuilder (2008-03-08)
</h3>
<p>
For RubyOnRails developers, <a href='http://api.rubyonrails.org/classes/ActionView/Helpers/FormHelper.html'><samp>form_for</samp> and <samp>fields_for</samp></a>
are the accepted way of DRYing up form templates. You know the deal; you code
</p>
<pre>
<% form_for :customer, :url => customers_path() do |customer_form| %>
<p>
<label>Name:</label>
<%= customer_form.text_field :first_name, :size => 15 %>
<%= customer_form.text_field :last_name, :size => 20 %>
</p>
... etc ...
<% end %>
</pre>
<p>
and you get
</p>
<pre>
<form action="/customers" method="post">
<p>
<label>Name:</label>
<input id="customer_first_name" name="customer[first_name]" size="15" type="text" />
<input id="customer_last_name" name="customer[last_name]" size="20" type="text" value="" />
</p>
... etc ...
</form>
</pre>
<p>
Rails generates sensible field names and ids for you, and slurps existing values out of the model object. So far, so good.
</p>
<p>
Lately, I've taken to using the same trick when presenting data, not just when editing it. So, whereas before I might have written:
</p>
<pre>
<p>
<label>Name:</label>
<span id="customer_first_name"><%= h @customer.first_name %></span>
<span id="customer_last_name"><%= h @customer.last_name %></span>
</p>
... etc ...
</pre>
<p>
I'll now code it up as:
</p>
<pre>
<% fields_for :customer, :builder => ReadOnlyFormBuilder do |customer_form| %>
<p>
<label>Name:</label>
<%= customer_form.text_field :first_name, :size => 15 %>
<%= customer_form.text_field :last_name, :size => 20 %>
</p>
... etc ...
<% end %>
</pre>
<p>
and get the same output. (In case you're wondering, the ids are there to help with automated testing).
</p>
<p>
Note the similarity between the last code snippet and the first one on this page; apart from the first line they're
identical. Usually, I'll put the field-declarations themselves in a partial that's shared between "new", "edit" and "show"
actions. That way, your "show" page automatically gets the same layout as the others, just with raw values in place of editable fields.
</p>
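<p>
As a sketch of that shared-partial pattern (the file names and local-variable plumbing here are my own, purely for illustration):
</p>
<pre>
<!-- app/views/customers/_fields.html.erb (hypothetical name) -->
<p>
  <label>Name:</label>
  <%= customer_form.text_field :first_name, :size => 15 %>
  <%= customer_form.text_field :last_name, :size => 20 %>
</p>

<!-- show.html.erb: same partial, but with the read-only builder -->
<% fields_for :customer, :builder => ReadOnlyFormBuilder do |customer_form| %>
  <%= render :partial => "fields", :locals => { :customer_form => customer_form } %>
<% end %>
</pre>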
<p>
The ReadOnlyFormBuilder class itself is fairly straightforward - I'm planning to wrap it up into a plugin sometime soon. In the meantime, the implementation of text_field looks something like this:
</p>
<pre>
def text_field(attribute, options={})
  content_tag("span", html_escape(value_of(attribute)), :id => "#{@object_name}_#{attribute}")
end

def value_of(attribute)
  model.send(attribute)
end

def model
  @object || @template.instance_variable_get("@#{@object_name}")
end
</pre>
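<p>
Outside of Rails, the core of what <samp>text_field</samp> does is just string assembly plus HTML-escaping. Here's a minimal stand-alone sketch (the method name and arguments are mine, for illustration only), using <samp>ERB::Util.html_escape</samp> from the standard library:
</p>

```ruby
require "erb"

# Render a value as a read-only <span> with a predictable id,
# HTML-escaping the content on the way through.
def read_only_field(object_name, attribute, value)
  escaped = ERB::Util.html_escape(value.to_s)
  %(<span id="#{object_name}_#{attribute}">#{escaped}</span>)
end

read_only_field(:customer, :first_name, "Mike & Co.")
# => <span id="customer_first_name">Mike &amp; Co.</span>
```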