General
Backup and Migration (TKLBAM)
Usage
- Is there a limit to how frequently I can backup?
- How often does a full backup happen, how can I configure this?
- How do I exclude a database or table from my backup?
- What's the difference between a full backup and an incremental backup?
- I forgot my passphrase, and I "lost" my escrow key. Can you help me?
- How do I add a directory to my backup?
- How do I remove a file or directory from being included in my backup?
- How does TKLBAM know what to backup on my system?
- How do I backup to local storage (instead of S3)?
- How do I tune and optimize a TKLBAM backup?
- Can I have multiple TKLBAM backups on a single system?
- Can I use TKLBAM to only backup a single directory?
Amazon S3
- What are the advantages of isolating TKLBAM Amazon S3 storage?
- Why can't I access TKLBAM storage buckets with other Amazon S3 tools?
- Do I have to store my backups in the Amazon S3 storage cloud?
- What happens if my payment method to Amazon is invalidated?
- The Hub says my backup costs $0.00, what am I really paying?
- How much does cloud backup storage cost?
- How do I monitor how much traffic is being uploaded or downloaded?
- How can I throttle how much bandwidth TKLBAM uses?
General
TurnKey is 100% free software and by free we mean in the important sense of freedom not price. Free as in free speech, not free beer.
That means TurnKey is free from restrictive proprietary licensing, free from hidden backdoors and free to use, learn from, modify and distribute. See the licensing page for full details.
Without the values and fruits of free software, the Internet wouldn't exist and neither would this project. If you care about living in a free society, enjoying free speech and the freedom of the Internet - you should care about free software too.
- Selection: the largest free software repository. Over 37,500 packages.
- Security: all packages are supported with carefully backported security updates that can be safely installed automatically.
- Stability: Debian has a well deserved reputation for rock solid stability. Other distributions are often not even comparable to the Debian testing branch.
- Free: Debian is 100% free software. Free from hidden backdoors, free to use, learn from, modify and redistribute.
- Community: Debian is powered by the world's oldest, largest and most vibrant free software non-profit organization. Debian democratically governs itself using the Debian Social Contract as its constitution.
  There are over 1000 passionately committed Debian Developers worldwide. Debian also has the largest ecosystem of derivative distributions. The focus is on development not marketing, which has created a paradoxical situation in which the branding of Debian based distributions such as Ubuntu is better known in certain circles (e.g., commercial industry) than Debian itself.
- No central point of failure: many other commercially sponsored Linux distributions have a central point of failure. They are largely dependent on the success and continued independence of their commercial sponsor in a competitive marketplace. As a non-profit organization Debian can not fail in the marketplace or get bought out. Debian has been around for more than 20 years and it will be around in another 20. The same can not be said for other Linux distributions.
Who are you guys?
TurnKey GNU/Linux was founded circa 2008 by Alon Swartz and Liraz Siri. Like many aspiring young hackers of their generation, they grew up with the Internet and were hungry to unlock its secrets. They soon discovered free software and the community of highly talented philanthropist programmers behind it, without which the Internet as we know it would not be possible.
This was a revelation and a deeply transformative educational experience that would eventually inspire TurnKey years later, first as a side project and later as a full-time mission to give back to the tradition that gave so much to them while maximizing the amount of good they could do with the resources they had to work with.
Circa 2009, Jeremy Davis stumbled upon TurnKey and became part of the team, first as a community volunteer, then as a core team member. If you post on the forums, you'll almost certainly come across him! :)
Why is promoting free software important? Because the benefits of free software extend well beyond price or economical utility. It's about freedom. Freedom to have full control over your computing. Freedom to inspect the inner workings of the software you use. Freedom from hidden NSA backdoors. Freedom to learn and innovate without having to ask anyone for permission.
Because unlike with proprietary software, free software is free to use, modify and distribute. You can make it better! It comes with full source code. There is no secret sauce, nothing you are not allowed to understand. You can learn how computers work inside out by studying source code for any component of the GNU/Linux operating system. It's a priceless educational resource that allows aspiring hackers to learn from some of the best programmers in the world. Artists who love their craft so much they literally do it for free. Who apply to software engineering the noble scientific tradition of open, borderless collaboration that underlies modern science and technology.
In a nutshell: trust, but verify.
Trust (the short version): we have a reputation to maintain and are not naive regarding the risks. Since 2008, we've poured our hearts and souls into TurnKey. Over a million images have been downloaded. It's a free software project. Full source code for everything is available. You can use the build system to build any solution in the library from scratch. There's no place for bad stuff to hide. If there was any funny business going on, someone would have surely discovered our evil plans by now.
Verify: Since TurnKey GNU/Linux solutions are built mostly from unmodified Debian binaries, it is possible for anyone to verify the integrity of the binaries that make up a solution against the original package signatures from the official Debian repositories.
Custom TurnKey packages are updated from our cryptographically signed package repository. Full source code for all custom components is available on GitHub, and so is the source code to all the appliances.
To prevent tampering, we sign all releases so that users can cryptographically verify the integrity of their downloads. Also, our virtual appliances are configured to automatically verify the cryptographic integrity of any package (including custom components) that is installed through the package management system (e.g., automatic security updates).
In other words, users should be able to trust a TurnKey installation as much as they trust a normal general-purpose installation of Debian.
If there is anything else we can do to satisfy our more paranoid users, please let us know.
Backup and Migration (TKLBAM)
Overview
If you're using any TurnKey derived system, you don't need to install it, as TKLBAM is bundled into TurnKey Core.
If you're using a generic Debian or Ubuntu derived system you can install it with the following shell command:
wget -O - -q \
    https://raw.github.com/turnkeylinux/tklbam/master/contrib/ez-apt-install.sh \
    | PACKAGE=tklbam /bin/bash
This adds the TurnKey package repository to your APT sources and uses APT to install the tklbam package and its dependencies.
Using TKLBAM on TurnKey Linux provides the best experience but it will also work well with any Debian or Ubuntu derived system and even with other Linux distributions if you install from source and use the --skip-packages option to disable integration with APT, the Debian package manager.
When you use TKLBAM with TurnKey Linux it takes advantage of the known fixed installation state to make the smallest possible backup. For example, it will only backup /etc configuration files that you have changed. This makes migration easier by increasing visibility into what actually changed. By comparison on a generic Debian or Ubuntu system it will backup all /etc configuration files.
Pretty much anything, though storing backups to Amazon S3 is easiest because authentication and key management are automatic. You just need to run:
tklbam-backup
But you can also backup to any storage target supported by TKLBAM's back-end Duplicity including the local filesystem, NFS, Rsync, SSH, FTP, WebDAV, Rackspace CloudFiles and even IMAP.
The local filesystem is one of the easier storage targets to use because you don't need to mess around with authentication credentials.
So assuming you want to store your backup at /mnt/otherdisk:
tklbam-backup --address file:///mnt/otherdisk/tklbam/backup
tklbam-escrow /mnt/otherdisk/tklbam/key
And restore like this:
tklbam-restore --address file:///mnt/otherdisk/tklbam/backup \
    --keyfile=/mnt/otherdisk/tklbam/key
Not as easy as the Hub-enabled "automatic" mode, but still vastly easier than your conventional backup process. The disadvantage is that you won't be able to restore/test your backup in the cloud, or from a VM running in another office branch (for example). Also keep in mind that a physical hard disk, even a RAID array, provides much lower data reliability compared with Amazon S3.
For this reason we recommend users use local backups to supplement cloud backups (e.g., providing fast local access).
Currently, only MySQL and PostgreSQL have built-in support but TKLBAM can work with other databases so long as you configure custom serialization/unserialization procedures in a hook script.
Usage
No, but if you backup more frequently (e.g., hourly instead of daily), we strongly recommend creating full backups more frequently too: daily or weekly instead of monthly (i.e., the default).
The reason is that long backup chains are inefficient and more vulnerable if something goes wrong as links in the chain depend on one another.
If you backup daily, and do a full backup monthly, your backup chains will consist of a full backup with at most 31 incremental backups linked to it. But if you backup hourly, by the end of the month your backup chain could consist of up to 744 incremental backups, all of which have to be downloaded and extracted when you restore.
To configure automatic hourly backups with a full backup every 7 days:
mv /etc/cron.daily/tklbam-backup /etc/cron.hourly
chmod +x /etc/cron.hourly/tklbam-backup
echo full-backup 7D >> /etc/tklbam/conf
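The chain-length arithmetic above can be sketched as a small shell function. This is only an illustration of the worst-case numbers; the function name chain_length is made up for this example and is not part of TKLBAM.

```shell
#!/bin/sh
# chain_length BACKUPS_PER_DAY DAYS_BETWEEN_FULL_BACKUPS
# Upper bound on the number of incremental backups in one chain.
# (chain_length is a hypothetical helper, not a TKLBAM command.)
chain_length() {
    backups_per_day=$1
    days_between_full=$2
    echo $((backups_per_day * days_between_full))
}

chain_length 1 31    # daily backups, monthly full backup
chain_length 24 31   # hourly backups, monthly full backup
chain_length 24 7    # hourly backups, weekly full backup
```

Comparing the last two calls shows why a shorter full-backup interval matters once you switch to hourly backups.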
By default, a full backup will happen if the last full backup is older than 30 days. Between full backups, all backup sessions are incremental.
We recommend enabling the daily backup cron job so that daily incremental backups happen automatically:
chmod +x /etc/cron.daily/tklbam-backup
You can override the default by setting the full-backup parameter in the tklbam configuration:
# create a full backup every 14 days
echo full-backup 14D >> /etc/tklbam/conf
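Appending with >> adds a new line every time it is run. If you script this setting, one possible approach is to replace any existing full-backup line instead. The helper below is a sketch under that assumption; set_full_backup is a hypothetical name, and how TKLBAM itself treats repeated full-backup lines is not covered here.

```shell
#!/bin/sh
# set_full_backup CONF INTERVAL
# Drop any existing "full-backup" line, then append the new value,
# so repeated runs leave exactly one setting in the file.
# (set_full_backup is a hypothetical helper, not a TKLBAM command.)
set_full_backup() {
    conf=$1
    interval=$2
    tmp=$(mktemp) || return 1
    if [ -f "$conf" ]; then
        # keep every line whose first field is not "full-backup"
        awk '$1 != "full-backup"' "$conf" > "$tmp"
    fi
    echo "full-backup $interval" >> "$tmp"
    # write through the original path to keep its ownership and mode
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Example (as root): set_full_backup /etc/tklbam/conf 14D
```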
By adding a negative database override to /etc/tklbam/overrides:
# exclude drupal7 database
echo -mysql:drupal7 >> /etc/tklbam/overrides
# exclude sessions table in drupal8 database
echo -mysql:drupal8/sessions >> /etc/tklbam/overrides
Or on the command line:
tklbam-backup -- -mysql:drupal6/page_cache
By default ALL databases are backed up so adding a negative database override excludes only that database or table from the backup.
Excluding a table only excludes its data. The schema is still backed up as long as the database is included.
Specifying a positive database override changes the default behavior so that only the database or table specified in the override is included in the backup.
You can mix positive overrides with negative overrides.
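As a rough illustration of mixing the two kinds of override, the sketch below writes one positive and one negative entry to a scratch file rather than to /etc/tklbam/overrides. It assumes a positive override is written like the negative form without the leading minus sign; check the TKLBAM documentation before relying on that.

```shell
#!/bin/sh
# Build an example overrides file in a scratch location.
overrides=$(mktemp)

# Positive override (assumed syntax): include only the drupal7 database.
printf '%s\n' 'mysql:drupal7' >> "$overrides"

# Negative override: skip the data of its sessions table.
printf '%s\n' '-mysql:drupal7/sessions' >> "$overrides"

cat "$overrides"
```

printf is used instead of echo so that an entry starting with a minus sign is never mistaken for an option.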
A full backup is a backup that can be restored independently of any other backup. An incremental backup links with the last backup before it and only includes changes made since the previous backup.
Backup chains are links of backup sessions which start with a full backup, and then a series of incremental backups, each recording only the changes made since the backup before it. Incremental backups are useful because they are fast and efficient.
Restoring an incremental backup requires retrieving the volumes of all incremental backup sessions made before it, up to and including the full backup that started the chain. The longer the backup chain, the more time it will take to restore.
Sorry, if your server is gone (e.g., terminated EC2 instance) nobody can help you. Next time either save an escrow key somewhere safe or don't set a passphrase.
Don't misunderstand, we'd love to help if we could, but we can't. The encryption key for your backup was generated locally on your server not ours. We designed passphrase protection to use special cryptographic countermeasures to make typical cracking techniques (e.g., dictionary attacks) very difficult even for someone with access to massive amounts of computer resources.
Note, if the system you backed up is still available, just log into it as root and change the passphrase (you don't need to know the old passphrase):
tklbam-passphrase
By adding an override to /etc/tklbam/overrides:
echo /mnt/images >> /etc/tklbam/overrides
Or on the command line:
tklbam-backup /var/www/*/logs
Make sure you understand the implications of doing this. For example, if you add a directory handled by package management this may break package management on the system you restore to.
By adding a negative override to /etc/tklbam/overrides:
echo -/var/www/*/logs >> /etc/tklbam/overrides
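Because echo with >> appends unconditionally, running the same command twice leaves duplicate entries in the overrides file. A minimal sketch of an add-if-missing helper follows; add_override is a hypothetical name used only for this example.

```shell
#!/bin/sh
# add_override FILE ENTRY
# Append ENTRY to FILE unless an identical line is already there.
# (add_override is a hypothetical helper, not a TKLBAM command.)
add_override() {
    file=$1
    entry=$2
    # -x: match the whole line, -F: fixed string (so * is literal),
    # --: stop option parsing so a leading "-" in ENTRY is safe
    grep -qxF -- "$entry" "$file" 2>/dev/null ||
        printf '%s\n' "$entry" >> "$file"
}

# Examples (as root); quoting keeps the shell from expanding the *:
#   add_override /etc/tklbam/overrides '/mnt/images'
#   add_override /etc/tklbam/overrides '-/var/www/*/logs'
```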
Every TurnKey appliance that TKLBAM supports has a corresponding backup profile, which is downloaded from the Hub when you initialize TKLBAM. The profile is used to calculate the list of system changes we need to backup. It usually describes the installation state of a TurnKey appliance and contains a list of packages, filesystem paths to scan for changes and an index of the contents of those paths which records timestamps, ownership and permissions.
You can also generate your own custom profiles with the following command:
tklbam-internal create-profile
The backup profile is stored in /var/lib/tklbam/profile and contains the following text files:
- dirindex.conf: a list of directories to check for changes by default. This list does not include any files or directories maintained by the package management system.
- dirindex: appliance installation state - filesystem index
- packages: appliance installation state - list of packages
Users can override which files and directories are checked for changes by configuring overrides (See below).
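To check that a profile directory contains the three files listed above, something like the following sketch could be used. The file names come from the list; check_profile itself is a hypothetical helper.

```shell
#!/bin/sh
# check_profile DIR
# Report which of the expected profile files are missing from DIR.
# Returns 0 if all are present, 1 otherwise.
# (check_profile is a hypothetical helper, not a TKLBAM command.)
check_profile() {
    dir=$1
    missing=0
    for name in dirindex.conf dirindex packages; do
        if [ ! -f "$dir/$name" ]; then
            echo "missing: $dir/$name"
            missing=1
        fi
    done
    return $missing
}

# Example: check_profile /var/lib/tklbam/profile && echo "profile looks complete"
```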
By default TKLBAM is designed to work with S3 automatically, and this is the easiest and safest option. In manual mode, TKLBAM can also work with non-S3 storage addresses, but this complicates usage and carries additional risks which you should make sure you understand first to avoid data loss.
Of all non-S3 manual storage targets, the local filesystem is the simplest option because you don't need to mess around with authentication credentials.
So assuming you want to store your backup at /mnt/otherdisk:
tklbam-backup --address file:///mnt/otherdisk/tklbam/backup
tklbam-escrow /mnt/otherdisk/tklbam/key
And restore like this:
tklbam-restore --address file:///mnt/otherdisk/tklbam/backup --keyfile=/mnt/otherdisk/tklbam/key
Not as easy as the Hub-enabled "automatic" mode, but still easier than a conventional backup process. Linux supports mounting most types of storage devices (e.g., external hard drive, local network file share) to the filesystem, though this can require extra configuration at the operating system level.
One of the main disadvantages of using a local storage target, besides the more complicated setup and maintenance process, is that you won't be able to restore/test your backup in the cloud, or from a VM running in another office branch (for example).
Also keep in mind that a physical hard disk, even a RAID array, provides much lower data reliability than the 11 nines (99.999999999%) of Amazon S3.
For this reason we recommend users use local backups to supplement cloud backups (e.g., providing fast local access).
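The file:// addresses used above are simply the prefix file:// followed by an absolute path, which is why they contain three slashes. A small sketch that builds such an address and rejects relative paths (file_address is a hypothetical helper name):

```shell
#!/bin/sh
# file_address PATH
# Print a file:// storage address for an absolute PATH.
# (file_address is a hypothetical helper, not a TKLBAM command.)
file_address() {
    case $1 in
        /*) printf 'file://%s\n' "$1" ;;
        *)  echo "error: need an absolute path: $1" >&2
            return 1 ;;
    esac
}

file_address /mnt/otherdisk/tklbam/backup
```

The printed address can then be passed to the --address option, as in the commands above.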
One of my favorite ways to do this:
# step 1: generate a backup dump
tklbam-backup --dump=/tmp/mybackup
# step 2: interactively review the dump's file contents & disk usage
cd /tmp/mybackup
apt-get install ncdu
ncdu
# step 3: add includes or excludes, go back to step 1, rinse, repeat
vim /etc/tklbam/overrides
# Everything perfect?
tklbam-backup --raw-upload=/tmp/mybackup
By default, TKLBAM will automatically determine what paths and databases need to be backed up on a given TurnKey system according to the backup profile it gets from the Hub. The default profile tracks changes to the user-serviceable, customizable parts of the filesystem (e.g., /etc /root /home /var /usr/local /opt /srv) while ignoring changes in areas maintained by the package management system.
You can "override" the default backup profile configuration by specifying overrides, either on the command line, or preferably by editing the /etc/tklbam/overrides configuration file.
Yes. For example, let's say your default TKLBAM backup is several gigabytes in size and you'd like to create a lighter 100 MB backup that will be updated more frequently and take less time to update/restore:
cp -a /etc/tklbam /etc/tklbam.light
echo -/var/www/\*/logs >> /etc/tklbam.light/overrides
echo -/home/liraz/bigfiles >> /etc/tklbam.light/overrides
echo -mysql:mydatabase/bigtable >> /etc/tklbam.light/overrides
export TKLBAM_CONF=/etc/tklbam.light
mkdir /var/lib/tklbam.light
export TKLBAM_REGISTRY=/var/lib/tklbam.light
tklbam-init
tklbam-backup
For convenience you may want to create a script that sets the TKLBAM_REGISTRY and TKLBAM_CONF environment variables:
cat > /usr/local/bin/tklbam-backup-light << EOF
#!/bin/bash
export TKLBAM_CONF=/etc/tklbam.light
export TKLBAM_REGISTRY=/var/lib/tklbam.light
tklbam-backup
EOF
chmod +x /usr/local/bin/tklbam-backup-light
Yes. Here are a couple of recommended ways to do this:
- Create a separate backup with an empty backup profile:
export TKLBAM_REGISTRY=/var/lib/tklbam.srv
export TKLBAM_CONF=/etc/tklbam.srv
tklbam-init --force-profile=empty
tklbam-backup --skip-packages --skip-database -- /srv
- Use the --raw-upload option
This lobotomizes TKLBAM so instead of creating a system level backup it just backs up the directory you specify. In other words, --raw-upload turns TKLBAM into a directory-level backup tool rather than a system-level backup tool.
For example, let's say you have a collection of big files at /srv that you don't want to include in your system backup (e.g., because you don't want to bloat your backup).
So you configure an override to exclude the /srv directory from your default backup and create another TKLBAM backup just for the big files:
echo -/srv >> /etc/tklbam/overrides
export TKLBAM_REGISTRY=/var/lib/tklbam.srv-raw
tklbam-backup --raw-upload=/srv
Later, you'll need to use the --raw-download option to restore:
tklbam-restore --raw-download=/srv <your-backup-id>
If you don't use the raw-download option, TKLBAM will assume you are trying to restore a system-level backup and you'll get an error.
Amazon S3
- Easier sign up process. Users don't need to know anything about S3 API keys or understand the implications of giving them to us.
- Security: you don't need to give us access to your generic S3 account. If someone compromises your regular AWS API Key they still can't get to your encrypted backup volumes and say... delete them.
- Cost transparency: TKLBAM related storage charges show up separately from your generic S3 storage.
Please note that new(er) Hub accounts DO use generic S3 buckets for backup storage. We will be transitioning existing users to the new system in the future. We will contact users individually when the time comes.
No! TKLBAM stores backups in the cloud for convenience, but it also supports local / custom backup storage targets.
There are two main alternatives to letting TKLBAM store a backup in the cloud:
- Low-level tklbam-backup --dump option: lets you dump the raw TKLBAM backup extract to a directory, which you can then store any way you like.
For example, here's how we'd dump a system backup into a simple unencrypted tarball:
cd /tmp
mkdir mybackup
tklbam-backup --dump=mybackup/
tar jcvf mybackup.tar.bz2 mybackup/
And later restore it like this:
cd /tmp
tar jxvf mybackup.tar.bz2
tklbam-restore mybackup/
The --dump option bypasses Duplicity, which usually creates a series of encrypted archive files that can be incrementally updated. These archive files are stored by default in the Amazon S3 storage cloud but you can override this with the --address option and specify any storage back-end supported by Duplicity (e.g., local directory, rsync over ssh, ftp, sftp, etc).
- High-level tklbam-backup --address option: lets you specify a custom backup target URL that is passed on to Duplicity.
It is highly recommended to rehearse a trial restore. Testing your backups is always a good idea, and even more so with a custom --address as this may complicate usage.
The Hub normally helps you manage your backup's metadata when it auto-configures the storage address. If you specify a manual address you need to manage storage locations, encryption keys and authentication credentials by hand.
Amazon supports payment by credit card and bank account. We recommend heavy users add a bank account as their payment method, as it's usually more permanent than a credit card.
In any case, if your payment method is invalidated (e.g., cancelled or expired credit card), billing will fail and Amazon will attempt to contact you (e.g., by e-mail) to provide a new, valid payment method.
If you notice $0.00 in the backups console, there's no need to open a support request. It's not a bug. At 15 cents per gigabyte, if you have just a few megabytes of data Amazon doesn't charge you anything.
Backups start from around 100KB for a freshly installed TurnKey appliance. Remember, TKLBAM only saves changes you've made since the appliance was installed.
Amazon S3 cloud storage fees are around $0.15/GB per month.
You can use simulation mode to calculate how much uncompressed data TKLBAM is going to store in a full backup:
$ tklbam-backup --simulate
CREATING /TKLBAM
FULL UNCOMPRESSED FOOTPRINT: 148.30 MB in 6186 files
In practice, the actual footprint of a full backup will usually be smaller due to compression, but this depends on the type of data being compressed (e.g., text compresses very well, video very poorly).
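To turn the simulated footprint into a rough monthly figure, you can multiply it by the storage rate quoted above. The sketch below assumes 1 GB = 1024 MB and the approximate $0.15/GB rate from this FAQ; estimate_monthly_cost is a hypothetical helper, and real charges depend on Amazon's current pricing and on how well your data compresses.

```shell
#!/bin/sh
# estimate_monthly_cost SIZE_MB [RATE_PER_GB]
# Rough upper bound on monthly storage cost in dollars for one
# uncompressed full backup. RATE_PER_GB defaults to 0.15.
# (estimate_monthly_cost is a hypothetical helper, not a TKLBAM command.)
estimate_monthly_cost() {
    awk -v mb="$1" -v rate="${2:-0.15}" \
        'BEGIN { printf "%.2f\n", mb / 1024 * rate }'
}

estimate_monthly_cost 148.30   # the footprint from the simulation above
estimate_monthly_cost 10240    # a 10 GB backup
```

Small backups round down to a few cents or less, which is why the Hub can show very low costs.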
By default, a full backup is performed if one month has passed since the last full backup. In between, incremental backups will be performed which only record changes since the last backup. The full backup frequency can be customized. See the manual page for details.
Fault tolerance
Yes. Backups which have already been configured will continue to work normally. If TKLBAM can't reach the Hub it just uses the locally cached profile and S3 address.
No, for a couple of reasons:
- After the initial setup, TKLBAM communicates directly with Amazon S3. Even if the Hub goes down, backups will not be interrupted.
- You can use TKLBAM without linking it to the Hub at all. See the tklbam-init --solo option.
Yes - manually. It just won't be as easy. You'll need to do a couple of steps by hand:
- Transfer the escrow key to the restore target.
  This means you'll need to have stored the escrow key somewhere safe or be able to create it on the backed up machine.
- Specify the S3 address and the key manually when you restore.
For more details see the tklbam-restore documentation.
Yes - but only manually. Just remember the Hub won't know anything about these backups so you'll have to manage keys and authentication credentials by hand.