TurnKey Linux Virtual Appliance Library

Background task processing and deferred execution in Django

Or, Celery + RabbitMQ = Django awesomeness!

As you know, Django is synchronous, or blocking. This means each request will not be returned until all processing (e.g., of a view) is complete. It's the expected behavior and usually required in web applications, but there are times when you need tasks to run in the background (immediately, deferred, or periodically) without blocking.

Some common use cases:

  • Give the impression of a really snappy web application by finishing a request as soon as possible, even though a task is running in the background, then update the page incrementally using AJAX.
  • Executing tasks asynchronously and using retries to make sure they are completed successfully.
  • Scheduling periodic tasks.
  • Parallel execution (to some degree).

Good automation vs bad automation

I recently eliminated a bit of code that was supposed to handle upgrading our build infrastructure from using one distribution (e.g., Ubuntu 8.04 LTS) to another (e.g., Ubuntu 10.04 LTS). That got me thinking about how to decide (and then explain) when it's a good idea to automate and when it isn't.

Django Signals: Be lazy, let stuff happen magically

When I first learned about Django signals, it gave me the same warm fuzzy feeling I got when I started using RSS. Define what I'm interested in, sit back, relax, and let the information come to me.

You can do the same in Django, but instead of getting news type notifications, you define what stuff should happen when other stuff happens, and best of all, the so-called stuff is global throughout your project, not just within applications (supporting decoupled apps).

For example:

  • Before a blog comment is published, check it for spam.
  • After a user logs in successfully, update his twitter status.

What you can do with signals are plentiful, and up to your imagination, so lets get into it.

Friends don't let friends program in shell script

Lately I've been going over a hellish patch-work of old shell scripts we wrote to automate some internal processes and I realized something: friends shouldn't let friends program in shell script.

Why?

Using git and rsync to synchronize changes on a staging box to a live server

The problem: working on a live web site is a bad idea

Anyone who's ever worked on a sufficiently complex web site knows it's a bad idea to work directly on the live server hosting the site for a couple of important reasons:

  1. It's disruptive to visitors: If - sorry when you break something - your visitors are going to be exposed to it. Nothing creates a bad impression faster than a broken web site.
  2. Fear is stressful, stress kills productivity: you know if you mess around too much with the web site there's a good chance you'll break it. Naturally you don't want this to happen so your mind becomes preoccupied with the fear of making mistakes, and its hard to focus on what needs to be done.

We develop this web site and test all non-trivial changes in a local TurnKey Drupal instance running inside a virtual machine. This means we can experiment and screw things up with no consequences. I find removing that source of stress makes you much happier and more productive as a web developer.

Working like this raises a few practical questions though:

  • How do you push changes from the development box used for staging to the live web site without accidentally overwriting changes made by someone else?
  • How do you track who changed what?
  • When you screw things up on your development box, how do you reset the changes you've made and start again?