Running database migrations on Cloud Foundry

19 Jun, 2013

An important part of deploying a database-backed application is keeping the database schema up-to-date. For Rails applications, you typically do that by running:

$ rake db:migrate

For Rails applications running on Heroku, you typically run the migrations immediately after deploying a new version of your app, e.g.

$ git push heroku master
$ heroku run rake db:migrate

Since Cloud Foundry takes after Heroku in so many ways, I expected to use a similar workflow when deploying my Rails application on Cloud Foundry. I was suprised to discover that CF does not support any equivalent of "heroku run". That is, there's not (yet) a built-in way to run a Rake task or shell-command in the context of the currently deployed application.

The Cloud Foundry documentation suggests a somewhat surprising solution: alter the startup command for your application to execute migrations before (re)-starting the web-app. I duly did so by specifying a custom start command in the Cloud Foundry "manifest.yml" for my app:

---
applications:
- name: barfly
  command: "bundle exec rake db:migrate && bundle exec rackup -p $PORT"

We don't really want to run the migrations on every app-server instance, though. Luckily, Cloud Foundry provides meta-data to each instance, in the form of an environment variable, $VCAP_APPLICATION. Specifically, it provides an "instance_id" key, which contains the a unique number for every instance.

So, I created a Rake task to limit an action to the first instance, ie. the one with instance_id == 0.

namespace :cf do

  desc "Only run on the primary Cloud Foundry instance"
  task :on_primary_instance do
    instance_index = JSON.parse(ENV["VCAP_APPLICATION"])["instance_index"] rescue nil
    exit(0) unless instance_index == 0
  end

end

If the instance_id is non-zero, or unset, the task exits Rake early, skipping any subsequent tasks. With this in place, I altered the startup command to make use of the new task:

---
applications:
- name: barfly
  command: "bundle exec rake cf:on_primary_instance db:migrate && bundle exec rackup -p $PORT"

This actually works fairly well; the migrations run, then the application-server starts up. Of course, if the migrations fail, the app-server won't start. That's fine with me, for now, and I imagine it would be acceptable for many apps.

Others, though, might want/need tighter control over when database migrations run, rather than just running them automatically on boot. With that in mind, an alternative approach would be to handle migrations entirely separately from application deployments. Cloud Foundry provides easy access to information about database (and other external) services, e.g.

$ cf file barfly logs/env.log | grep DATABASE_URL
DATABASE_URL=postgresql://deadb33f:5Dvc0ePHMrwFUuODGQiSYWCYHU-nIzu-@babar.elephantsql.com:5432/deadb33f

Using those connection details, the deployment script could connect to and migrate the target database schema, prior to the "cf push" that updates the app. In many ways, this is preferrable to the push-migrate workflow typically used with Heroku, as it creates the flexibility to run constructive migrations independent of application deployments.

Feedback