Using git and rsync to synchronize changes on a staging box to a live server

The problem: working on a live web site is a bad idea

Anyone who's ever worked on a sufficiently complex web site knows it's a bad idea to work directly on the live server hosting the site for a couple of important reasons:

  1. It's disruptive to visitors: if (sorry, when) you break something, your visitors are going to be exposed to it. Nothing creates a bad impression faster than a broken web site.
  2. Fear is stressful, and stress kills productivity: you know that if you mess around too much with the web site there's a good chance you'll break it. Naturally you don't want this to happen, so your mind becomes preoccupied with the fear of making mistakes, and it's hard to focus on what needs to be done.

We develop this web site and test all non-trivial changes in a local TurnKey Drupal instance running inside a virtual machine. This means we can experiment and screw things up with no consequences. I find removing that source of stress makes you much happier and more productive as a web developer.

Working like this raises a few practical questions though:

  • How do you push changes from the development box used for staging to the live web site without accidentally overwriting changes made by someone else?
  • How do you track who changed what?
  • When you screw things up on your development box, how do you reset the changes you've made and start again?


When we started we didn't give these issues much thought and would just rsync a bunch of random directories around in an ad-hoc fashion. That inevitably led to a few nasty mistakes, which convinced us we needed to think this through.

Our solution: volatile-pull, volatile-sync

Here's what we came up with:

  • One volatile directory to rule them all: we moved the directories that were being changed (e.g., theme, modules) to /volatile and created symlinks to them from their original, scattered locations on the filesystem.

    A single volatile directory was much easier to keep track of mentally than an ad-hoc collection of directories.

  • Revision control: on our development instances, we enabled revision control on /volatile by turning it into a Git repository with two main branches:

    1. 'local' branch: where we committed our changes.
    2. 'remote' branch: where we kept a copy of the live server's /volatile.
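To make the layout above concrete, here is a rough Python sketch of the one-time setup. It is not the script from the download, just one way the steps could look. The helper names (relocate, init_repo), the commit message, and the example paths in the usage note are made up for illustration, and it assumes git is installed and a git identity (user.name, user.email) is already configured.

```python
import shutil
import subprocess
from pathlib import Path


def relocate(original: Path, volatile: Path) -> Path:
    """Move `original` into `volatile` and leave a symlink at the old path."""
    target = volatile / original.name
    if original.is_symlink():
        return target  # already relocated on an earlier run
    volatile.mkdir(parents=True, exist_ok=True)
    shutil.move(str(original), str(target))
    # The site keeps finding its files at the old location via the symlink.
    original.symlink_to(target, target_is_directory=True)
    return target


def git(volatile: Path, *args: str) -> None:
    """Run a git command inside the volatile directory, failing loudly."""
    subprocess.run(["git", "-C", str(volatile), *args], check=True)


def init_repo(volatile: Path) -> None:
    """Turn `volatile` into a repository with 'remote' and 'local' branches."""
    git(volatile, "init", "-q")
    # Name the first branch 'remote' before anything is committed.
    git(volatile, "symbolic-ref", "HEAD", "refs/heads/remote")
    git(volatile, "add", "-A")
    git(volatile, "commit", "-q", "--allow-empty", "-m", "Initial snapshot")
    # 'local' starts from the same snapshot; day-to-day commits go here.
    git(volatile, "branch", "local")
    git(volatile, "checkout", "-q", "local")
```

Usage would be something like relocate(Path("/var/www/site/themes"), Path("/volatile")) for each directory you edit, followed by a single init_repo(Path("/volatile")). The /var/www path is only an example.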

We then wrote a simple script (see the download link below) which supports two operations:

  1. volatile-pull: pulls changes from the live server to the development instance
    1. rsync the live /volatile into the 'remote' branch and commit.
    2. merge the 'remote' branch into the 'local' branch, letting the developer resolve any conflicts between their own changes and changes made by another developer.
  2. volatile-sync: synchronizes the development instance with the live server
    1. call volatile-pull: before pushing our changes to the live server we always pull first, to minimize the risk of accidentally overwriting changes another developer has made since we last pulled.
    2. rsync the contents of the 'local' branch to the live server's /volatile.
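The two operations above can be sketched as a list of commands. This is an approximation written for illustration, not the contents of the downloadable script: the host name in LIVE, the function names, and the exact rsync flags are assumptions. It also assumes your own changes are already committed on the 'local' branch before you start, since switching branches with uncommitted work would get in the way.

```python
import subprocess
from typing import List

LIVE = "root@live.example.com:/volatile/"  # hypothetical live server location
VOLATILE = "/volatile"                     # local git working tree


def rsync_cmd(src: str, dst: str) -> List[str]:
    # -a preserves permissions and timestamps, --delete mirrors removals,
    # and excluding .git keeps repository metadata from being copied or deleted.
    return ["rsync", "-a", "--delete", "--exclude=.git", src, dst]


def pull_steps(live: str = LIVE, volatile: str = VOLATILE) -> List[List[str]]:
    """Commands for volatile-pull: snapshot live into 'remote', merge into 'local'."""
    git = ["git", "-C", volatile]
    return [
        git + ["checkout", "remote"],
        rsync_cmd(live, volatile + "/"),
        git + ["add", "-A"],
        git + ["commit", "--allow-empty", "-m", "Snapshot of live /volatile"],
        git + ["checkout", "local"],
        git + ["merge", "remote"],  # exits non-zero if there are conflicts
    ]


def sync_steps(live: str = LIVE, volatile: str = VOLATILE) -> List[List[str]]:
    """Commands for volatile-sync: always pull first, then push 'local' to live."""
    return pull_steps(live, volatile) + [rsync_cmd(volatile + "/", live)]


def run(steps: List[List[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for cmd in steps:
        subprocess.run(cmd, check=True)
```

Because run() stops at the first failing command, a merge conflict aborts the sync before anything is pushed. You would resolve the conflict, commit, and run it again.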

This technique should work for many similar development scenarios, not just Drupal web site development.

Note that pulling just before we push does not completely eliminate the risk of overwriting changes made by another developer: if they happen to push between our pull and our push, their changes can still be lost. To prevent this edge case you would need to implement some sort of locking mechanism. We didn't bother.
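For the curious, one common way to build such a lock is to rely on directory creation being atomic: only one process can succeed in creating a given directory. The sketch below shows the idea in Python. It is not part of the downloadable script, and the names (sync_lock, LockHeld) are invented here. To actually protect the live server, the lock directory would have to live somewhere every developer's sync run can see, for example on the live server itself; this local version only demonstrates the mechanism.

```python
import os
from contextlib import contextmanager


class LockHeld(RuntimeError):
    """Raised when another sync already holds the lock."""


@contextmanager
def sync_lock(lock_dir: str):
    """Hold an exclusive lock for the duration of a `with` block."""
    try:
        # mkdir either creates the directory or fails; there is no in-between,
        # so two processes can never both believe they own the lock.
        os.mkdir(lock_dir)
    except FileExistsError:
        raise LockHeld("another sync is in progress: " + lock_dir) from None
    try:
        yield
    finally:
        os.rmdir(lock_dir)  # release the lock even if the sync failed
```

A sync would then wrap its pull-and-push sequence in a `with sync_lock(...)` block, so a second developer gets a LockHeld error instead of pushing at the same time. If a sync is killed outright, the lock directory stays behind and has to be removed by hand.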

Download: volatile-sync.tar.gz

Database synchronization considerations

With Drupal an extra caveat is that much of the web site lives in the database, so filesystem-level synchronization only solves a part of the problem.

Since the state of a live dynamic web site can change while you are working on the development instance (e.g., new users, new forum posts), we only ever pull the database from the live web site server to the development instance, never in the other direction.
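As a rough illustration of a one-way database pull, here is a Python sketch. It assumes the site uses MySQL, that you can ssh to the live server, and that database credentials are already configured on both machines (for example in ~/.my.cnf). None of those details come from our actual setup as described here, and the function names are invented for the example. Note there is deliberately no push counterpart.

```python
import subprocess
from typing import List, Tuple


def db_pull_cmds(live_host: str, db: str) -> Tuple[List[str], List[str]]:
    """Build the dump command (run on live via ssh) and the local load command."""
    # --single-transaction takes a consistent snapshot of InnoDB tables
    # without locking the live site while the dump runs.
    dump = ["ssh", live_host, "mysqldump", "--single-transaction", db]
    load = ["mysql", db]
    return dump, load


def db_pull(live_host: str, db: str) -> None:
    """Stream a dump of the live database straight into the local one."""
    dump, load = db_pull_cmds(live_host, db)
    with subprocess.Popen(dump, stdout=subprocess.PIPE) as producer:
        subprocess.run(load, stdin=producer.stdout, check=True)
    # Leaving the `with` block waits for the dump to finish.
    if producer.returncode != 0:
        raise subprocess.CalledProcessError(producer.returncode, dump)
```

Calling db_pull("live.example.com", "drupal") would overwrite the local development database with the live one. Both the host name and the database name are placeholders.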

We made only one exception, when we were upgrading the web site from Drupal 5 to Drupal 6. That required a massive database update, which we felt more comfortable preparing on the development instance and then pushing out to the new Drupal 6 based live web site. In the meantime, we put a notice in the template of the old Drupal 5 site warning users that any change to the web site would be lost. Fifteen minutes later, the new Drupal 6 based site was up.

Do you use a staging box? How do you synchronize? Don't hog your knowledge, leave a comment!


Chaim Krause

Jesse Freeman talks about using version control for his website.

Liraz Siri

Thanks for the link Chaim! Jesse's setup is interesting, but I think it is overly complex. One of the hallmarks of good design is that you get the job done as simply as possible (e.g., with the fewest elements).

Also when possible I prefer to set things up so that they just work without having to waste any mental cycles on them. That's part of the reason we fully automated the synchronization process for the web site. Then when I'm ready to deploy I don't really have to think about the process. It just works.

