How TKLBAM hooks work

Most TKLBAM users probably don't realize this, but TKLBAM has a nifty, general purpose hooks mechanism you can use to trigger useful actions on backup and restore.

Examples of hooks:

  • Cleaning up temporary files
  • Stopping/starting services to increase data consistency
  • Encoding/decoding data from non-supported databases
  • Using LVM to create/restore a snapshot of a fast changing volume

Originally I developed the hooks mechanism so we could fix a few issues indirectly related to the usability of TKLBAM. In particular, our very first beta users reported that sometimes tklbam-restore would fail to find any backup volumes. When we investigated, this turned out to be a clock discrepancy. The obvious solution was to sync the clock before starting the restore, but the more I thought about it, the more the idea of hardwiring that ntpdate stuff rubbed me the wrong way, for a few reasons:

  • It's an auxiliary problem, not a core issue with TKLBAM's logic
  • I'm offline much of the time during development so I needed some way to turn this off, but I didn't want to add more testing-specific code unless it was absolutely necessary.
  • It's OK if a specific server (e.g., pool.ntp.org) is the default, but there should be some way to configure it if a user, for example, wants to use an internal NTP server.

I tried to think of a clean way to achieve these simple goals (e.g., CLI options, environment variables, configuration files), but everything I came up with was just so darn ugly.

Then I realized that a hooks mechanism would solve this problem in a simple, generic way.

Implementation

/etc/tklbam/hooks.d may contain executables (e.g., scripts) that tklbam runs before and after two operations (currently):

  1. backup
  2. restore

Two arguments are passed to the hooks:

  1. operation: restore/backup
  2. state: pre/post

A non-zero exit code raises a HookError.
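
For illustration, a minimal hook skeleton might look something like this (just a sketch; a hook can be any executable, this one happens to be a shell script):

#!/bin/sh
# minimal /etc/tklbam/hooks.d/ skeleton (illustration only)
# tklbam invokes it as: <hook> <operation> <state>

op=$1      # "backup" or "restore"
state=$2   # "pre" or "post"

case "$op/$state" in
    backup/pre)
        # e.g., dump extra data or stop a busy service
        ;;
    backup/post)
        # e.g., restart whatever was stopped
        ;;
    restore/post)
        # e.g., load dumped data back in
        ;;
esac

# exiting non-zero makes tklbam raise a HookError
exit 0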

Advantages

In one stroke, this solves the clock problem and also lets advanced users define their own hooks to take care of things TKLBAM doesn't (e.g., stopping I/O intensive processes before backup, encoding/decoding unsupported databases, etc.).
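
For instance, a hook along these lines could quiesce an I/O intensive service while the backup runs (a sketch only; "some-busy-service" is a placeholder for whatever init script applies on your system):

#!/bin/sh
# sketch: stop a hypothetical I/O heavy service during backup, restart it after
# "some-busy-service" is a placeholder, not a real service name

[ "$1" = "backup" ] || exit 0

case "$2" in
    pre)  /etc/init.d/some-busy-service stop ;;
    post) /etc/init.d/some-busy-service start ;;
esac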

Example fixclock hook

#!/usr/bin/python
# hook that runs ntpdate before duplicity to sync clock to UTC

import os
import sys
import executil
from string import Template

NTPSERVER = os.environ.get("NTPSERVER", "pool.ntp.org")

ERROR_TPL = """\
##########################
## FIXCLOCK HOOK FAILED ##
##########################

Amazon S3 and Duplicity need a UTC synchronized clock so we invoked the
following command::

    $COMMAND

Unfortunately, something went wrong...

$ERROR
"""

def fixclock():
    command = "ntpdate -u " + NTPSERVER

    try:
        executil.getoutput(command)
    except executil.ExecError, e:
        msg = Template(ERROR_TPL).substitute(COMMAND=command,
                                             ERROR=e.output)

        print >> sys.stderr, msg,
        sys.exit(1)

def main():
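    # tklbam passes two arguments: operation (backup/restore) and state (pre/post)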
    op, state = sys.argv[1:]

    if op in ('restore', 'backup') and state == 'pre':
        fixclock()

if __name__ == "__main__":
    main()

Comments

Jeremy Davis

I was wondering how that all works. Good to see some documentation of it. I think this needs to go in the docs too. If I get around to it, I'll do it but got lots on ATM so won't make any promises right now! :)

Also, one very relevant question which has come up a couple of times on the forums and I don't think has been answered yet: how do you back up an unsupported appliance? I think it is appropriate that by default unsupported appliances (such as LAPP) throw an error when trying to use TKLBAM. However, using these instructions it is possible for end users to tweak the settings so they can use the TKLBAM hooks to customise their TKLBAM backups, e.g. to dump and restore a non-supported DB. But how does one work around the fact that the Hub doesn't recognise the appliance as supported? I.e. what needs to be hacked to trick the Hub into accepting the backup?

And while we're at it, other TKL things I'd love to see some documentation of and/or blog post about are:

  • Usage of etckeeper - I have had a bit of a dig around online and found that documentation of etckeeper is pretty thin. A helpful passerby fixed me up with a command to purge the repo of old entries (when I ended up with a 500MB backup which was mostly stuff from Webmin logs, which IMO it inappropriately stores in /etc) but I have yet to come across further info on things like how to restore old entries. I'm assuming one would use git, but it'd be great if someone could document it a little better (in case I need to use it one day).
  • Usage of LVM snapshots - This is another great default feature of TKL appliances that is under-documented IMO and would probably be more used were it better documented here. This one I'm sure has plenty of info online, but again I think it'd be great if it was spelled out for TKL users.
  • Some useful TKLBAM tweaks - Such as suggested hooks for backups or handy exclusions/inclusions for different usage scenarios (such as excluding/including selected logs which may be useful in some scenarios, or not in others). This one is perhaps more for the docs/wiki and is definitely something I could contribute to. I think it'd be great to have some suggested tweaks using the TKLBAM hooks. Such as purging the etckeeper cache (i.e. git repo) prior to backup (to avoid exponentially growing backups for machines that have been running for a while).
Liraz Siri

Documentation: Yeah, there's kind of an issue here with stuff that belongs in the documentation vs blog material. Maybe we could set up a resources page, or tips and tricks page that links to useful blog posts.

Or when creating a new page isn't the right thing to do we could just edit the most relevant documentation pages and add links there.

Regarding Postgres: Adding Postgres support to TKLBAM has been on my todo list for too long now. I don't want to sound like a broken record but I'll be getting to it real soon now.

If anyone wants to document instructions for getting over the current TKLBAM limitations to kind-of sort-of support Postgres based appliances (and other stuff) they're very welcome to have a go at it, but I can do better. This year I've had way too many distractions from development, which is a shame because I like doing that much better than the boring distractions. I already know what my next year's resolution is going to be...

Regarding etckeeper: We know about the issue, and a fix has already been committed to the upcoming maintenance version. As for currently installed versions, the workaround is very simple and is documented in the etckeeper README:

cd /etc/etckeeper/post-install.d
(echo '#!/bin/sh' ; echo 'exec git gc') > 99git-gc
chmod +x 99git-gc
git add .
git commit -m "run git gc after each apt run"

We could push out an automatic update that commits this fix to everyone using the current version of TurnKey. That's something to think about...

Regarding LVM: The blog post is the first thing that comes up if you search for LVM on the TKL website.

Jeremy Davis

Docs: Yes, putting links back to the blogs in the docs would be a good idea IMO. And perhaps instead of having another dev contest we could have a doc contest?! :)

LAPP appliance: So how do you get the Hub to accept a backup from the LAPP (or any other unsupported) appliance? I haven't tried it myself, but the consensus seems to be that it stops dead in its tracks because it doesn't have a Hub profile. It'll be great when you do it properly, but in the meantime that is the missing TKLBAM-specific piece of the puzzle (at least from my perspective). All the rest of it could be worked out, I'm sure.

etckeeper: Cool, but I also think that part of the problem is that Webmin stores so much stuff in /etc. Without Webmin leaving all its logs in there, etckeeper buildup wouldn't be such an issue. IMO it should go to somewhere like /var/webmin/logs or something... I've actually started doing that. I can't recall OTTOMH but I'm pretty sure I posted what I did. Which was to move the relevant folder (out of /etc/webmin/) and create a symlink to it in /etc/webmin/. That reduced my /etc (and hence my TKLBAM backup) by nearly 50MB. Now 50MB isn't much these days as far as storage is concerned, but it is still a sizeable amount of data to upload, especially if you have a slower connection.

Liraz Siri

etckeeper: excellent idea regarding moving webmin logs to /var/logs. We'll look into it for the next maintenance version. In the meantime if you don't want TKLBAM to backup etckeeper you can add an override to exclude /etc/.git.
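
Something along these lines should do the trick (assuming the stock overrides file at /etc/tklbam/overrides):

echo "-/etc/.git" >> /etc/tklbam/overrides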

LAPP: Yeah, the profiles are missing, but they're missing for a reason. TKLBAM doesn't know (yet) how to serialize/unserialize Postgres databases. I could generate a profile for LAPP/Postgres in a couple of minutes, but if TKLBAM doesn't backup your database you might think your appliance was backed up when one of the most important parts wasn't...

Docs: A doc contest would be a pretty neat idea, though the terms of the contest for collaboratively written documentation would need some thought...

Jeremy Davis

I understand your rationale for not having a LAPP profile. And I agree that it's better that by default it errors, as I don't think there would be much worse than a user thinking they have backed up their appliance only to find they have no DB backup when it counts (obviously users should test their backups before they need them, but human nature being what it is, that's not a good assumption to make IMO).

But as it currently stands (by my understanding - please correct me if I'm wrong), even if a user sets up their own hooks, the LAPP appliance will still just throw an error when trying to backup (because no profile exists).

So the question is: How do you trick the Hub into thinking that your LAPP appliance is say a Core appliance (and then you can set your own additional hooks and includes)? In other words, how can you make the Hub use the Core TKLBAM profile for the LAPP appliance?

Liraz Siri

What you say is entirely true. Currently, you do in fact need to trick TKLBAM into backing up the LAPP appliance. It's very easy actually:

echo turnkey-core-11.2-lucid-x86 > /etc/turnkey_version

That will make TKLBAM report that it is in fact TurnKey Core, which will get the profile for Core from the Hub. You do this at your own peril of course, but if you're careful and the restore works (e.g., on Core, or from a similarly hacked LAPP) you're good to go.

Jeremy Davis

I promise I'll be careful.

Jeremy Davis

Firstly - great work on the 30 min Moodle & Mahara install! That's pretty impressive IMO! :)

IMO an integration with BigBlueButton as well would make this a killer e-learning appliance!

But on to my opinion on your Qs:

  1. Assuming that Mahara needs PostgreSQL (which I assume it does - otherwise you'd just use LAMP stack...) then I would think so... If it will run with MySQL then perhaps the Moodle appliance is a better place to start? (AFAIK TKL Moodle has upstream Moodle installed to LAMP stack).
  2. Sorry don't quite get this question... TKL appliances (at least the EBS backed ones) allow the choice of what size EBS volume to use (regardless of instance size). So the smallest size instance that allows it to work properly (given the expected load etc) with the smallest EBS you think you'll need for storage will be the cheapest option... (Although likely I missed the actual point of your Q)
  3. I don't recall anyone actually spelling it out... But this blog post (and the comments) seem to provide all the info (although I guess actually attempting it will be the judge of that...)
  4. You could do that, but Liraz provides info on hacking TKL PostgreSQL (or any other appliance that uses PostgreSQL and is therefore 'unsupported' by TKLBAM) to appear to the Hub as a supported appliance (e.g. Core), therefore allowing TKLBAM to run... Obviously you'll still need to provide the TKLBAM hook to dump/restore the PgSQL DB.
  5. AFAIK you have 2 options... Buy an elastic IP and associate that to your appliance (in effect a static IP) and use your domain name host's DNS config to point to the elastic IP (as you would any server with a static IP) OR use HubDNS with a 'custom' domain name (I've never done it, but under 'EC2 Account' in the Hub there is a link to 'Add custom domain' - the help links to the HubDNS blog post for more info). My personal preference would probably be the former but perhaps that's just me... Also I haven't done the sums on the price differences, I guess that would be a significant factor.

Look forward to hearing how it all goes. And BTW congrats on your new job! :)

L. Arnold

I am trying to upgrade to Wheezy. I have the new TKLBAM loaded and the system will take a "core" profile, but I cannot find or get the proper Postgres profile to load.

Is there an easy way to generate one and get the profile loaded on the Hub?

Thanks for any and all help.

Ronan0

I would like to test TKLBAM with some (embedded) H2 database applications. Thanks.

Jeremy Davis

For some reason I have not been getting notifications of comments on blog posts...

Anyway, WRT your previous (old) question; tklbam hooks are essentially just bash scripts. So the process would be to start on the commandline of your appliance and work out what commands achieve the ends you are after (e.g. dump a DB to a file). So long as you put the file somewhere that tklbam is already backing up (or explicitly make sure tklbam is backing up where the data is) then you should be good to go.

Don't forget though that you'll also need to create a tklbam hook to restore the data!
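
As a rough sketch of the pattern (the dump and restore commands are just placeholders - substitute whatever works for your particular database):

#!/bin/sh
# /etc/tklbam/hooks.d/ sketch: dump a database before backup, reload it after restore
# "my-dump-command" and "my-restore-command" are placeholders

DUMP=/var/backups/mydb.sql   # make sure this path is included in the backup

if [ "$1" = "backup" ] && [ "$2" = "pre" ]; then
    my-dump-command > "$DUMP"
elif [ "$1" = "restore" ] && [ "$2" = "post" ]; then
    my-restore-command < "$DUMP"
fi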

Finally, please feel free to share your code. It's likely someone else may be trying to achieve the same ends and you might save them some time! :)

Jeremy Davis

Then it should be as simple as:

tklbam-backup

For more info, please see the docs. If you need more of a hand, please sign up for an account and start a new thread in the forums.
