TurnKey Linux Virtual Appliance Library
Backup and Migration (TKLBAM)
General

TurnKey Linux was started in 2008 as an open source project by Alon Swartz and Liraz Siri, inspired by the principle of Ubuntu, an African word meaning "humanity towards others". As technology enthusiasts we felt we had benefited enormously from open source: the software, the ideals of freedom, borderless cooperation, and a counter-intuitive gift culture based on enlightened self-interest. TurnKey was our way of giving back to the community that had given so much to us. That's how it started. Since then, new friends who share our passion and values have joined us from all over the world, and TurnKey has become much more than just a software project.

In a nutshell: trust, but verify.

Trust (the short version): we have a reputation to maintain and are not naive regarding the risks. Since 2008, we've put a tremendous amount of work into TurnKey. Nearly a million appliances have been downloaded. It's an open source project. There's no place for bad stuff to hide. If there were any funny business going on, someone would surely have discovered our evil plans for world domination by now.

Verify: Since a TurnKey Linux virtual appliance is built almost entirely from unmodified Ubuntu binaries, it is possible for anyone to verify the integrity of the binaries that make up a virtual appliance against the original package signatures from the official Ubuntu repositories.

There are minor exceptions. When required, a virtual appliance may contain a few custom packages which are updated from our cryptographically signed package repository. Full source code for all custom components is available in our code repository. Some components are also hosted on GitHub.

To prevent tampering, we sign all releases so that users can cryptographically verify the integrity of their downloads. Also, our virtual appliances are configured to automatically verify the cryptographic integrity of any package (including custom components) that is installed through the package management system (e.g., automatic security updates).

In other words, users should be able to trust a TurnKey Linux virtual appliance as much as they trust a normal general-purpose installation of Ubuntu.

If there is anything else we can do to satisfy our more paranoid users, please let us know.
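As a rough illustration of the download check described above, the sketch below compares a file's SHA-256 digest with an expected value. It is a generic example, not TurnKey's actual release-verification procedure: the function name `verify_sha256` and the file name in the usage comment are hypothetical, and it assumes the `sha256sum` tool from GNU coreutils is installed. A checksum only helps if the expected value itself comes from a trusted (for example, signed) source.

```shell
#!/bin/sh
# Generic sketch: succeed only if FILE's SHA-256 digest equals EXPECTED.
# verify_sha256 is a hypothetical helper, not part of any TurnKey tool.
verify_sha256() {
    file="$1"
    expected="$2"
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual="$(sha256sum "$file" 2>/dev/null | awk '{ print $1 }')"
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Usage (hypothetical file name, digest taken from a trusted source):
#   verify_sha256 appliance.iso "<expected digest>" && echo "checksum OK"
```

A missing file or a mismatching digest both make the function return a non-zero status, so it can be used directly in an `if` or `&&` chain.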

(and not Redhat/CentOS or Novell SUSE, or Gentoo, or Slackware, or Linux from Scratch, etc.)

The short answer is that like millions of other Linux enthusiasts we have grown to love Ubuntu.

It's no accident that Ubuntu has quickly grown to be the largest and most popular Linux distribution in the world by a significant margin.

Ubuntu embodies "humanity towards others" in a way that inspires a deep passion amongst its users.

By following the Ubuntu Code of Conduct, the Ubuntu community has developed into a friendly oasis in which everyone is invited and treated with respect, whether they are technical gurus or uninformed newcomers.

At a technical level, Ubuntu is by far the most transparent of any distribution with major financial backing. All development happens out in the open. There are so few boundaries between part-time community volunteers and full-time employees that it can be hard to tell them apart. This makes Ubuntu much easier to work and collaborate with if you're a developer.

Finally, unlike other commercial Linux distributors, Ubuntu isn't distracted by the inherent conflicts of interest in maintaining a premium for-pay product and a free community edition. Ubuntu is all free, including updates!

What about Debian?

We couldn't love Ubuntu without loving Debian too, and in the future we'd like to work on building the TurnKey virtual appliance library on top of Debian as well.

Note that behind the scenes Ubuntu is based on Debian, one of the oldest and by far the largest of the non-commercial Linux distributions, with over a thousand dedicated volunteer developers, and more than 23,000 packages in its software repositories. Debian does much of the heavy lifting for Ubuntu behind the scenes, but Ubuntu certainly deserves credit for taking Debian the last mile and delivering its technical excellence to such a wide audience.

Ultimately, whether or not most Ubuntu users realize it, Debian is a long-term insurance policy in the remote case that something ever goes terribly wrong with Ubuntu's commercial sponsor Canonical. One of Debian's greatest strengths is that it has no single point of failure. In a worst case scenario, Debian will be able to offer a safe and free migration path for former Ubuntu users.

Ubuntu Server is a general purpose platform which a system administrator can use to integrate his or her own custom Linux server. If that is what you want then we recommend Ubuntu Server highly.

By contrast, a TurnKey Linux virtual appliance is designed to fill a specific niche role as efficiently and easily as possible. If that is what you want, you could save yourself or your organization valuable time and energy by using an existing TurnKey virtual appliance (i.e., assuming one exists for that role).

There are also a few secondary advantages you might find attractive:

  • Leaner footprint: A TurnKey virtual appliance is only as large as it has to be, so instead of downloading a 600MB installation ISO full of packages you will never use, a TurnKey virtual appliance starts from just 150MB.

  • Faster, easier install: Installing a TurnKey virtual appliance usually takes around one minute and is typically much easier than installing a comparable system (e.g., LAMP stack) via Ubuntu Server's standard installer.

Backup and Migration (TKLBAM)
Overview
Yes, TKLBAM is licensed under the GPL3. You don't have to care about free software ideology to appreciate the advantages. Any code running on your server doing something as critical as encrypted backups should be available for peer review and modification.

TKLBAM is short for TurnKey Linux Backup and Migration. It's designed specifically for TurnKey Linux and depends on many system-level details that don't necessarily apply to other Linux distributions (e.g., installation method, versioning signatures, etc.).

In the future, we may figure out how to extend the design to support additional operating systems, but it's not trivial and we don't have a timeline on when, or even if, that will happen.

In the meantime, if you really want to use TKLBAM, consider virtualization-based workarounds. For example, if you install a TurnKey Linux VM on top of a Windows Server installation, you could use TKLBAM to back up anything that goes into the TurnKey Linux VM.

All except Zimbra and the PostgreSQL based appliances (PostgreSQL, LAPP, OpenBravo). PostgreSQL support is in the works but it's not ready yet.

Regarding older versions of TurnKey, any version of TurnKey from 2009.02 onwards will work with TKLBAM, including all beta versions.

Currently only MySQL. PostgreSQL support is under development. Support for additional databases will be added as needed. Currently TurnKey appliances only include MySQL and PostgreSQL databases.

On any system descended from a TurnKey Linux installation, regardless of hardware or location. Storing backups to Amazon S3 is easiest because authentication and key management are automatic. You just need to run:

tklbam-backup

But you can also back up to any storage target supported by TKLBAM's back-end, Duplicity, including the local filesystem, NFS, Rsync, SSH, FTP, WebDAV, Rackspace CloudFiles and even IMAP.

Costs

TKLBAM and the TurnKey Hub are free. To enable backups on your TurnKey Hub account you'll need to sign up for cloud storage on Amazon S3, which charges around $0.15/GB per month. Full details are available on the Amazon S3 pricing page.

You can use simulation mode to calculate how much uncompressed data TKLBAM is going to store in a full backup:

$ tklbam-backup --simulate
CREATING /TKLBAM
FULL UNCOMPRESSED FOOTPRINT: 148.30 MB in 6186 files

In practice, the actual footprint of a full backup will usually be smaller due to compression, but this depends on the type of data being compressed (e.g., text compresses very well, video very poorly).
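To turn the simulated footprint into a rough upper bound on the monthly storage charge, you can multiply it by the storage rate. The sketch below is only an estimate under stated assumptions: it uses the roughly $0.15/GB per month figure quoted above, treats 1 GB as 1024 MB, ignores compression and incremental backups, and assumes the footprint is reported in MB as in the sample output. The helper names `footprint_mb` and `estimate_monthly_cost` are hypothetical.

```shell
#!/bin/sh
# Extract the footprint figure from `tklbam-backup --simulate` output on stdin.
# Assumption: the figure is reported in MB, as in the sample output above.
footprint_mb() {
    awk '/FULL UNCOMPRESSED FOOTPRINT:/ && $5 == "MB" { print $4 }'
}

# Estimate the monthly cost in dollars of storing FOOTPRINT_MB megabytes.
# The rate defaults to 0.15 dollars per GB per month (assumed, may change).
estimate_monthly_cost() {
    footprint="$1"
    rate_per_gb="${2:-0.15}"
    awk -v mb="$footprint" -v rate="$rate_per_gb" \
        'BEGIN { printf "%.4f\n", mb / 1024 * rate }'
}

# The 148.30 MB footprint from the sample output:
estimate_monthly_cost 148.30
```

For the sample footprint this prints 0.0217, i.e. about two cents a month before compression. On a live system the two helpers could be combined as `estimate_monthly_cost "$(tklbam-backup --simulate | footprint_mb)"`.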

By default, a full backup is performed if one month has passed since the last full backup. In between, incremental backups will be performed which only record changes since the last backup. The full backup frequency can be customized. See the TKLBAM manual pages for details.

If you notice $0.00 in the backups console, there's no need to open a support request. It's not a bug. At 15 cents per gigabyte, if you have just a few megabytes of data Amazon doesn't charge you anything.

Backups start from about 10KB for a freshly installed TurnKey appliance. Remember, TKLBAM only saves changes you've made since the appliance was installed. 

In fact, a significant number of users are being charged less than 1 cent a month.

Usage

Sorry, if your server is gone (e.g., terminated EC2 instance) nobody can help you. Next time either save an escrow key somewhere safe or don't set a passphrase.

Don't misunderstand, we'd love to help if we could, but we can't. The encryption key for your backup was generated locally on your server, not ours. We designed passphrase protection to use special cryptographic countermeasures to make typical cracking techniques (e.g., dictionary attacks) very difficult even for someone with access to massive amounts of computer resources.

Note, if the system you backed up is still available, just log into it as root and change the passphrase (you don't need to know the old passphrase):

tklbam-passphrase

No, but if you back up more frequently (e.g., hourly instead of daily), we strongly recommend creating full backups more frequently - daily or weekly instead of monthly (i.e., the default).

The reason is that long backup chains are inefficient and more vulnerable if something goes wrong as links in the chain depend on one another.

If you back up daily, and do a full backup monthly, your backup chains will consist of a full backup with at most 31 incremental backups linked to it. But if you back up hourly, by the end of the month your backup chain could consist of up to 744 incremental backups, all of which have to be downloaded and extracted when you restore.
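The arithmetic behind those figures is simply backups per day multiplied by the number of days between full backups. The sketch below reproduces it as an upper bound; the function name `max_chain_length` is hypothetical and not part of TKLBAM.

```shell
#!/bin/sh
# Upper bound on the number of incremental backups in one chain:
# backups per day multiplied by the days between full backups.
max_chain_length() {
    backups_per_day="$1"
    full_every_days="$2"
    echo $(( backups_per_day * full_every_days ))
}

max_chain_length 1 31    # daily backups, monthly full backup
max_chain_length 24 31   # hourly backups, monthly full backup
max_chain_length 24 7    # hourly backups, weekly full backup
```

The three calls print 31, 744 and 168, which shows how much a weekly full backup shortens an hourly chain.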

To configure automatic hourly backups with a full backup every 7 days:

mv /etc/cron.daily/tklbam-backup /etc/cron.hourly
chmod +x /etc/cron.hourly/tklbam-backup

echo full-backup 7D >> /etc/tklbam/conf

By default, a full backup will happen if the last full backup is older than 30 days. Between full backups, all backup sessions are incremental.

We recommend enabling the daily backup cron job so that daily incremental backups happen automatically:

chmod +x /etc/cron.daily/tklbam-backup

You can override the default by setting the full-backup parameter in the tklbam configuration:

# create a full backup every 14 days
echo full-backup 14D >> /etc/tklbam/conf
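Appending with `>>` adds a new line every time it is run, so changing the interval more than once leaves several `full-backup` lines in the file. As a cautious alternative, the sketch below replaces any existing `full-backup` line instead of appending. The helper name `set_full_backup` is hypothetical, and the only assumption about the file format is the `full-backup <interval>` line shown above.

```shell
#!/bin/sh
# Set the full-backup interval, replacing any earlier full-backup line.
# set_full_backup is a hypothetical helper, not part of TKLBAM.
set_full_backup() {
    interval="$1"                      # e.g. 7D or 14D
    conf="${2:-/etc/tklbam/conf}"      # optional path, for testing
    tmp="$(mktemp)" || return 1
    if [ -f "$conf" ]; then
        # Keep every line except existing full-backup settings.
        grep -v '^full-backup[[:space:]]' "$conf" > "$tmp" || true
    fi
    printf 'full-backup %s\n' "$interval" >> "$tmp"
    # Write through the existing file so its permissions are preserved.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}

# Usage (run as root on the appliance):
#   set_full_backup 14D
```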

A full backup is a backup that can be restored independently of any other backup. An incremental backup links with the last backup before it and only includes changes made since.

A backup chain is a sequence of linked backup sessions that starts with a full backup, followed by a series of incremental backups, each recording only the changes made since the backup before it. Incremental backups are useful because they are fast and efficient.

Restoring an incremental backup requires retrieving the volumes of all backup sessions made before it, up to and including the full backup that started the chain. The longer the backup chain, the more time it will take to restore.
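To make the dependency concrete, the sketch below takes a list of sessions (oldest first, each marked `full` or `inc`) and prints which sessions must be retrieved to restore a given one. It is a toy model of the chain logic described above, not how TKLBAM or Duplicity track sessions internally; the function name `sessions_needed` is hypothetical.

```shell
#!/bin/sh
# Print the 1-based session numbers needed to restore session TARGET.
# Remaining arguments list the sessions oldest first as "full" or "inc".
sessions_needed() {
    target="$1"
    shift
    if [ "$target" -lt 1 ] || [ "$target" -gt "$#" ]; then
        echo "error: session $target is out of range" >&2
        return 1
    fi
    i=0
    start=0
    for kind in "$@"; do
        i=$((i + 1))
        if [ "$i" -gt "$target" ]; then
            break
        fi
        # Remember the most recent full backup at or before the target.
        if [ "$kind" = "full" ]; then
            start="$i"
        fi
    done
    if [ "$start" -eq 0 ]; then
        echo "error: no full backup at or before session $target" >&2
        return 1
    fi
    needed=""
    n="$start"
    while [ "$n" -le "$target" ]; do
        needed="$needed${needed:+ }$n"
        n=$((n + 1))
    done
    echo "$needed"
}

sessions_needed 3 full inc inc full inc
sessions_needed 5 full inc inc full inc
```

The first call prints `1 2 3` and the second prints `4 5`: once a new full backup starts a chain, earlier sessions are no longer needed to restore later ones.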

By adding a negative database override to /etc/tklbam/overrides:

# exclude drupal5 database
echo -mysql:drupal5 >> /etc/tklbam/overrides

# exclude sessions table in drupal6 database
echo -mysql:drupal6/sessions >> /etc/tklbam/overrides

By default ALL databases are backed up, so adding a negative database override excludes only that database or table from the backup.

By contrast, a positive database override changes the default behavior so that only the database or table specified in the override is included in the backup.

You can mix positive overrides with negative overrides.

By adding an override to /etc/tklbam/overrides:

echo /mnt/images >> /etc/tklbam/overrides

Make sure you understand the implications of doing this. For example, if you add a directory handled by package management this may break package management on the system you restore to.

By adding a negative override to /etc/tklbam/overrides:

# quote the pattern so the shell does not try to expand the wildcard
echo '-/var/www/*/logs' >> /etc/tklbam/overrides
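Because each of the commands above appends to the file, running one twice leaves a duplicate line. The sketch below adds an override only if the exact line is not already present, and uses `printf` with quoting so that leading dashes and wildcards are written literally. The helper name `add_override` is hypothetical; the override strings are the ones shown above.

```shell
#!/bin/sh
# Append an override line only if it is not already in the file.
# add_override is a hypothetical helper, not part of TKLBAM.
add_override() {
    entry="$1"
    file="${2:-/etc/tklbam/overrides}"   # optional path, for testing
    # -x: match whole lines, -F: fixed string, -e: allow a leading dash.
    if ! grep -qxF -e "$entry" "$file" 2>/dev/null; then
        printf '%s\n' "$entry" >> "$file"
    fi
}

# Usage (run as root on the appliance):
#   add_override '-mysql:drupal5'
#   add_override '-/var/www/*/logs'
#   add_override '/mnt/images'
```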

Every TurnKey appliance that TKLBAM supports has a corresponding backup profile, which is downloaded from the Hub the first time you back up an appliance. When required, the profile can be updated on demand (e.g., if we need to fix the profile).

The profile is stored in /var/lib/tklbam/profile and contains the following text files:

  1. dirindex.conf: a list of directories to check for changes by default. This list does not include any files or directories maintained by the package management system.
  2. dirindex: appliance installation state - filesystem index
  3. packages: appliance installation state - list of packages

Users can override which files and directories are checked for changes by configuring overrides (See below).

By default TKLBAM is designed to work with S3 automatically, and this is the easiest and safest option. In manual mode, TKLBAM can also work with non-S3 storage addresses, but this complicates usage and carries additional risks which you should make sure you understand first to avoid data loss.

Of all non-S3 manual storage targets, the local filesystem is the simplest option because you don't need to mess around with authentication credentials.

So assuming you want to store your backup at /mnt/otherdisk:

tklbam-backup --address file:///mnt/otherdisk/tklbam/backup
tklbam-escrow /mnt/otherdisk/tklbam/key

And restore like this:

tklbam-restore --address file:///mnt/otherdisk/tklbam/backup \
               --keyfile=/mnt/otherdisk/tklbam/key

Not as easy as the Hub-enabled "automatic" mode, but still easier than a conventional backup process. Linux supports mounting most types of storage devices (e.g., external hard drive, local network file share) to the filesystem, though this can require extra configuration at the operating system level.
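One easy mistake with a mounted disk is running the backup while the disk is not actually there. The sketch below wraps the two commands shown above in a guard that refuses to run unless the target directory exists and is writable. The wrapper name `local_backup` and its optional second argument (a command prefix such as `echo` for a dry run) are hypothetical; the `tklbam-backup --address` and `tklbam-escrow` invocations are the ones from the example above.

```shell
#!/bin/sh
# Back up to a local directory, but only if it exists and is writable.
# local_backup is a hypothetical wrapper; pass "echo" as the second
# argument to print the commands instead of running them.
local_backup() {
    target_root="$1"      # absolute path, e.g. /mnt/otherdisk
    runner="${2:-}"
    if [ ! -d "$target_root" ] || [ ! -w "$target_root" ]; then
        echo "error: $target_root is missing or not writable" >&2
        return 1
    fi
    mkdir -p "$target_root/tklbam" || return 1
    # An absolute path after file:// yields the file:///... form used above.
    $runner tklbam-backup --address "file://$target_root/tklbam/backup" || return 1
    $runner tklbam-escrow "$target_root/tklbam/key"
}

# Dry run:  local_backup /mnt/otherdisk echo
# Real run: local_backup /mnt/otherdisk
```

Checking that the directory exists does not prove the disk is mounted; on systems that provide it, a stricter check such as `mountpoint -q` could be added.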

One of the main disadvantages of using a local storage target, besides the more complicated setup and maintenance process, is that you won't be able to restore/test your backup in the cloud, or from a VM running in another office branch (for example).

Also keep in mind that a physical hard disk, even a RAID array, provides much lower data reliability than the 11 nines (99.999999999%) of Amazon S3.

For this reason we recommend users use local backups to supplement cloud backups (e.g., providing fast local access).

Amazon S3
TKLBAM doesn't store its data in generic S3 buckets, but in an isolated TKLBAM-specific area on S3. This means generic S3 tools such as the AWS management console or S3Fox will not be able to access the storage buckets in which TKLBAM backup volumes reside.
  1. Easier sign up process. Users don't need to know anything about S3 API keys or understand the implications of giving them to us.
  2. Security: you don't need to give us access to your generic S3 account. If someone compromises your regular AWS API key, they still can't get to your encrypted backup volumes and, say, delete them.
  3. Cost transparency: TKLBAM related storage charges show up separately from your generic S3 storage.

Amazon supports payment by credit card and bank account. We recommend heavy users add a bank account as their payment method, as it's usually more permanent than a credit card.

In any case, if your payment method is invalidated (e.g., cancelled or expired credit card), billing will fail and Amazon will attempt to contact you (e.g., by e-mail) to provide a new, valid payment method.

No. You don't have to actually use S3, but it's the default so you'll still have to sign up for it as part of the registration process for TKLBAM on the TurnKey Hub. Payment for Amazon S3 is $0.15/GB a month. If you don't use it Amazon won't charge you.

In theory, any storage target supported by Duplicity can be forced by adding the --address option when you backup and restore, but consider yourself warned...

Here Be Dragons!

Doing this complicates usage as the Hub only helps you manage your backups when it auto-configures the storage address. If you specify a manual address you are on your own. You will need to manage backups, encryption keys and authentication credentials by hand. Many things can go wrong, so please be extra careful. Test your backups to make sure the restore works.

And remember, in manual mode the Hub doesn't save your encryption keys for you. If you lose the key, your backup is lost forever.

Fault tolerance

Yes and no. On one hand, much of the streamlined usability of TKLBAM depends on the availability of the Hub. On the other hand, we designed TKLBAM to degrade gracefully if the Hub ever goes down (it shouldn't!).

As we scale the Hub we will gradually add capacity and build in additional layers of fault tolerance.

We have monitoring in place which alerts us immediately if anything unexpected happens.

Yes. Backups which have already been configured will continue to work normally. If TKLBAM can't reach the Hub it just uses the locally cached profile and S3 address.

Yes - manually. It just won't be as easy. You'll need to do a couple of steps by hand:

  1. transfer the escrow key to the restore target.

    This means you'll need to have stored the escrow key somewhere safe or be able to create it on the backed up machine.

  2. specify the S3 address and the key manually when you restore.

    For more details see the tklbam-restore documentation.

Yes - but only manually. Just remember the Hub won't know anything about these backups, so you'll have to manage keys and authentication credentials by hand.