TurnKey Linux Virtual Appliance Library

TKLBAM and Hub features

Chris Musty's picture

Hi TKL users and devs,

This post is intended to be a discussion on how TKL and TKLBAM have benefitted me and some features I would like to see.

If this is served better in another place then please let me know, otherwise I will start by saying I am very impressed so far and look forward to continuing development.

I write software for Windows and Linux that, 90% of the time, is a database frontend. Consequently I provide onsite and offsite servers, including file servers, database servers and print servers (so people can manage printing costs).

I have used TKLBAM for several MySQL servers in a test scenario to ensure it can do what I need. I plan to use it for my first full-scale backup in the coming week. Twice it has saved me hours of work recreating tables when a VM went KAPUT!

My testing has not been extensive and there may be features I have missed, but I have thoroughly tested what I intend to do.

I would like to see the following features:

  1. Backup statistics in hub or updated in the server CLI itself.
    My motivation behind this is to determine how long a large backup might take and the ease of mind provided by "numbers"
  2. Ability to cancel and delete a "backup in progress" in the HUB
  3. Ability to reuse numbers from deleted backups in the HUB
  4. Ability to group backups into "Folders" or other management methods.
  5. Postgres to work with TKLBAM
  6. Micro instances to become available in cloud deployments
  7. A light version of TKLBAM that backs up configuration scripts, settings etc
  8. Several new Turnkey appliances - I have posted a few in the blueprints section
  9. More as I think of them

I think the best way to help this happen is to evangelise the product, which I am doing with my network of colleagues. If there are practical examples of ways to provide help I would be limited (due to the need to earn money!) but will do what I can where I can.

It's funny that I found TKL when I was researching some apps I was going to use to create my own image templates - a welcome surprise!

Alon Swartz's picture

Comments

The general forum is the perfect place for this type of feedback.

Backup statistics: Not sure how TKLBAM can estimate how long duplicity will take to perform the upload. This might be something that can be hacked into duplicity itself. I recall chatting to Liraz about this, but I don't recall his answer unfortunately...

Delete backup-in-progress: It's on my todo list.

Reuse numbers from deleted backups: Can you elaborate? If backups are deleted they are deleted, along with all associated data.

Grouping backups: An idea that has been tossed around in the past is to support tagging in backups / servers, and provide a filter listing option. Would that solve your issue?

TKLBAM postgres support: High on Liraz's todo list.

Micro / EBS backed instances: Work in progress, and will probably be included with the upcoming TKL 11.2 maintenance release.

TKLBAM light: Can you please elaborate? Keep in mind that you can override the backup profile to include/exclude databases and paths.

More TKL appliances: TKL 11 part 2 is long overdue, but very high on our todo list as well. If you're interested in pushing certain appliances forward, having a TKLPatch available will most certainly help.

Chris Musty's picture

Elaborating on original posts

Backup statistics: (CLI)

  • Number of "chunks" (for want of a better word) that will need to be processed given everything that is to be backed up. E.g. if I have 50 GB and chunks are 50 MB then I would expect to see "uploading chunk 1 of 1024", with the chunk number incrementing as it currently does. Currently all I see is "Uploading [chunk] 214"
  • Time the last chunk took to upload, or an average upload time per chunk etc. This only needs to be calculated on the VM that is backing up, not from Amazon. Crude, I agree, but better than just telling me what chunk it is currently up to with no idea of how many are left.
  • Upload speeds and general info like when you use wget [======>45%          1016 Kbps] etc
  • Somehow communicating with the hub to see updates on number of uploaded chunks would be good too.
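
As a rough sketch of the progress output described above (not how TKLBAM or Duplicity actually report progress), the chunk total and a crude time estimate could be derived locally on the VM. The function names are hypothetical, and the total is only an estimate from the source size: it ignores compression and incremental deltas.

```python
def total_chunks(total_bytes: int, chunk_bytes: int) -> int:
    """Estimate how many chunks are needed, rounding up to a whole chunk."""
    return (total_bytes + chunk_bytes - 1) // chunk_bytes


def progress_line(done: int, total: int, elapsed_seconds: float) -> str:
    """Build a status line from the chunks uploaded so far and the time spent."""
    # Average time per chunk, measured only on the machine doing the backup.
    avg = elapsed_seconds / done if done else 0.0
    remaining_minutes = (total - done) * avg / 60
    return (
        f"Uploaded chunk {done} of {total}, "
        f"avg {avg:.1f}s/chunk, about {remaining_minutes:.0f} min left"
    )


if __name__ == "__main__":
    # 50 GB of data in 50 MB chunks, as in the example above.
    chunks = total_chunks(50 * 1024**3, 50 * 1024**2)
    print(progress_line(214, chunks, 2140.0))
```

The estimate gets better as more chunks complete, since the average smooths out slow and fast uploads.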

Reuse numbers from deleted backups:

Here is a screen dump to demonstrate what I am talking about. I must admit I do not understand the inner workings so this might not even be possible.

screenshot

As you can see there are many holes where I deleted a backup. Is it possible that when the next new backup is generated it fills the first available spot rather than the next incremental spot? I understand this may cause issues with backups, but would it also be possible to move (in this example) say #16 to #2?

Grouping backups:

Filter listings can be good, and possibly easier for you to implement, but I prefer to categorise clients and their backups in a hierarchical system, much like Windows Explorer or the standard "tree" control in a GUI. Either way, as I use this more and more I will have backups flowing for pages at a time. A choice to suit people's preferences would be good too.

TKLBAM light:

I understand the override feature for including/excluding but there was something specific I was after with this which I just cannot remember. I will have to revisit the test I was performing earlier and report back.

[edit] I must have been on drugs when I wrote this! What I was actually after is a message to a syslog server or an email confirming success. While the Hub interface is good, I don't really want to hunt through a huge list making sure everything backed up. I want to use my own logging server that throws alerts when it isn't updated with correct and timely responses.
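
A minimal sketch of that idea, assuming a wrapper script that runs after each backup: it sends one line to a syslog server using Python's standard logging module, and the logging server side can flag a host whose last success is too old. This is not an existing TKLBAM feature; the message format, the logger name and the 26-hour threshold are all assumptions for illustration.

```python
import logging
import logging.handlers
from datetime import datetime, timedelta


def make_backup_logger(host: str = "localhost", port: int = 514) -> logging.Logger:
    """Create a logger that forwards records to a syslog server over UDP."""
    logger = logging.getLogger("tklbam-report")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    if not logger.handlers:
        logger.addHandler(logging.handlers.SysLogHandler(address=(host, port)))
    return logger


def report(logger: logging.Logger, hostname: str, succeeded: bool) -> str:
    """Log one line describing the outcome of a backup run and return it."""
    status = "OK" if succeeded else "FAILED"
    message = f"tklbam-backup {status} host={hostname}"
    if succeeded:
        logger.info(message)
    else:
        logger.error(message)
    return message


def is_overdue(last_ok: datetime, now: datetime,
               max_age: timedelta = timedelta(hours=26)) -> bool:
    """Alert condition: no successful report within the allowed window."""
    return now - last_ok > max_age
```

On the backup machine this might be called as `report(make_backup_logger("logs.example.com"), "db1", True)`, where the host name is a placeholder.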

Chris Musty

Director

Specialised Technologies

Liraz Siri's picture

Thoughts

Progress reports/backup status: TKLBAM wraps around Duplicity on the back-end. I'll have to take a closer look to see if this is even possible. It may not be without significantly changing how Duplicity works. For example, I'm pretty sure it doesn't calculate in advance the number of blocks that need to be uploaded. But maybe it should...

Re-using backup ids: TKLBAM intentionally doesn't re-use old backup ids because it's not safe. Otherwise an old VM you forgot about (or bring back to life) may overwrite a "new" backup.

However, the order the backups show up in on the Hub could conceivably be controlled separately from the backup id. It's a UI issue.
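
To illustrate the distinction between backup ids and display order, here is a small sketch. It is not the Hub's actual code, and the class and method names are hypothetical. Ids come from a counter that only ever increases, so a deleted id is never handed out again, while the position shown in a listing is computed separately and has no holes.

```python
class BackupIdAllocator:
    """Hands out backup ids that are never reused, even after deletion."""

    def __init__(self) -> None:
        self._next_id = 1
        self._live = set()

    def allocate(self) -> int:
        """Return a fresh id; the counter never moves backwards."""
        new_id = self._next_id
        self._next_id += 1
        self._live.add(new_id)
        return new_id

    def delete(self, backup_id: int) -> None:
        """Forget a backup; its id stays retired."""
        self._live.discard(backup_id)

    def display_rows(self) -> list:
        """Pair a gap-free display position with each surviving backup id."""
        return list(enumerate(sorted(self._live), start=1))


if __name__ == "__main__":
    hub = BackupIdAllocator()
    for _ in range(3):
        hub.allocate()
    hub.delete(2)
    hub.allocate()             # gets id 4, not the freed id 2
    print(hub.display_rows())  # positions 1..3 for ids 1, 3 and 4
```

An old VM that still believes it owns backup #2 can then never collide with a newer backup, which is the safety concern described above.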

Grouping: we've always understood that as people use TKLBAM more heavily we may have to introduce better management interfaces than a flat list. At the same time we would still like to be very careful about not complicating the interface for new users. Whatever solution we eventually go with will have to balance the needs of new users with more advanced users.

TKLBAM reporting: I get it. I've put this down in my todo list and will think about it some more later.

Jeremy's picture

My 2c

A couple of the things that Chris mentioned have got me thinking...

The idea of optional TKLBAM confirmation emails, and/or some sort of logging (that could log to an as-yet undesigned TKL log server) would be cool I think.

Please correct me if I'm wrong, but what I take from Chris' suggestion of TKLBAM-light is basically to allow common site-wide config. E.g. at home I like to use my ISP mirror for Ubuntu repos, so rather than setting each image individually I could just run my generic TKLBAM-site-setup. This could also be used to configure other site settings and standard software that you want on all machines.

Another little thing: It'd be great if I could set my Hub default eg for me here in Australia I'd like to set SE Asia as my default location.

And of course I'm loving TKL too, but you guys know that already! :)

Chris Musty's picture

Thought about making a syslog appliance

It has long been on my to-do list, as every syslog server I use is built from scratch and cobbled together.

I have been considering making this with a custom web interface that runs from the server itself (I find free syslog GUI projects very limited and rare unless you pay - does anyone have a suggestion for a free GUI?). This will obviously not be a high-traffic server but should sustain maybe 10, 20, 30, possibly 50-100 inputs per second?

I certainly use syslog servers a lot so I assume others would too.

Would be interesting making my first TKL appliance.

Guess I need to read up on how it's all done properly!

Chris Musty

Director

Specialised Technologies

Alon Swartz's picture

@jeremy, could you expand on

@jeremy, could you expand on the idea of TKLBAM confirmation emails.

@chris, @jeremy - A TKL logging server is interesting, if someone could come up with the TKLPatch that would be awesome.

I'd like to get Liraz's thoughts on TKLBAM-light, it's an interesting idea...

Lastly, the Hub does (or at least should) set the default Amazon region for both your backups and when launching appliances, based on your IP. If it's not setting it, that's a bug and needs to be fixed.

Jeremy's picture

Just a simple confirmation email

TKLBAM Email:
I think it'd be a nice touch to have the Hub (or even my local server - not sure which would be best) send me an email when it's done its thing. It'd be great if that could be configurable: probably off by default, but also configurable to email on failure only, or email on full backup complete (as opposed to incremental), etc.
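
The options suggested above could be modelled roughly as follows. This is only a sketch of the decision logic; the policy names are made up for illustration and nothing like this is known to exist in TKLBAM or the Hub.

```python
from enum import Enum


class NotifyPolicy(Enum):
    """Hypothetical email settings, with OFF as the default."""
    OFF = "off"
    FAILURE_ONLY = "failure-only"
    FULL_ONLY = "full-only"
    ALWAYS = "always"


def should_notify(policy: NotifyPolicy, succeeded: bool, is_full: bool) -> bool:
    """Decide whether a finished backup run should trigger an email."""
    if policy is NotifyPolicy.OFF:
        return False
    if policy is NotifyPolicy.ALWAYS:
        return True
    if policy is NotifyPolicy.FAILURE_ONLY:
        return not succeeded
    # FULL_ONLY: only successful full backups, never incrementals.
    return succeeded and is_full
```

For example, `should_notify(NotifyPolicy.FAILURE_ONLY, succeeded=False, is_full=False)` is true, while a successful incremental under `FULL_ONLY` stays silent.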
 
TKL Log Server:
Yes, I think a log server is a great idea but I probably won't be doing much on this, for 2 reasons. Firstly, although I think it'd be cool, I have lots of other stuff on my todo list first. Secondly, I don't think I know enough about it to build something that would be really useful, especially in light of Chris' statement that he hasn't been able to find a nice WebUI for what he's after.

Having said that, I'm more than happy to help out, Chris. If there is anything I can help you with, let me know. TKLPatch is pretty well documented and, with a basic understanding of Linux, is pretty easy to use. In essence it's one or more install scripts with a file overlay (files to be overlaid on the root filesystem) that can then be applied to an ISO (ideally) or a running system (mixed results).

Hub bug: default AWS locale:
Guess it must be a Hub bug then. Now you mention it, I recall the discussion around that feature when you implemented it; I had forgotten all about it. It defaults to East Coast US, which is almost exactly on the other side of the world from me. I would imagine that West Coast US would be closer, but SE Asia would be the closest and most preferable IMO. I guess it depends on the pipelines though. Still, I think that the connection to W US would be better than E US. AFAIK there are 2 big fat pipes from Nth Au to SE Asia, 2 more from E Au to W US and only one from W Au to E US. See what you reckon; I'm more than happy to provide whatever extra info you would like to assist.
 
[edit - fixed some typos and organised my thoughts a little more clearly]
Alon Swartz's picture

more comments...

TKLBAM email: nice ideas, but needs some thought regarding user-experience and implementation. Added to my todo list.

TKL logging server: Can't wait to see what Chris comes up with...

Hub auto-location: For reference, this is the first blog post on the issue. Looking at the auto-generated map, AU should be associated with AP-SOUTHEAST. Could you take a look at your backups location? Also, could you try running the following and let me know the results (don't worry, it won't change anything, just output the closest archives per your IP address).

auto-apt-archive ubuntu
auto-apt-archive debian

Could you also drop me an email with your IP address, so I can perform some tests on the geo-location service?
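
For illustration only, picking the closest region for a known location can be sketched as a nearest-neighbour search by great-circle distance. This is not how the Hub's geo-location service is implemented; the region coordinates are rough assumptions and the Sydney coordinates are a hypothetical client location.

```python
import math

# Approximate (latitude, longitude) per region; illustrative values only.
REGIONS = {
    "us-east-1": (38.9, -77.4),
    "us-west-1": (37.4, -122.1),
    "eu-west-1": (53.3, -6.3),
    "ap-southeast-1": (1.3, 103.8),
    "ap-northeast-1": (35.7, 139.7),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b) -> float:
    """Great-circle distance between two (lat, lon) points in kilometres."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def closest_region(location) -> str:
    """Return the region whose reference point is nearest to the location."""
    return min(REGIONS, key=lambda name: haversine_km(location, REGIONS[name]))


if __name__ == "__main__":
    sydney = (-33.9, 151.2)  # hypothetical client location
    print(closest_region(sydney))
```

Geographic distance is only a proxy: as noted earlier in the thread, the actual network routes between continents can matter more than kilometres.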

Chris Musty's picture

When I find some time

I would like to do this but am restrained by the need to earn money. Probably in about 4-6 weeks I should be able to start something. I really can't find a good interface (free) so I will probably write my own, which unfortunately will add additional time to the project. Might start another thread for ideas and requests.

Chris Musty

Director

Specialised Technologies

Chris Musty's picture

Process and standards

I can read up on TKLPatch and build on top of things like LAMP, but is there anything in particular that is not used with TKL? E.g. for the web interface, is there a preference for PHP or CGI etc?

Anything else anyone wants to add?

Chris Musty

Director

Specialised Technologies

Jeremy's picture

I can't speak for the core devs

But my take on it is that TKL appliances need to be as user-friendly as possible but also somewhat generic in their initial config (ie no specific hardware optimisation etc). Having said that if you were to say, tune Apache to run better with a particular web app then that would be cool.

I know the TKL guys are big Python fans, but Webmin uses Perl so if you wished to build on Webmin (a custom Webmin module?) then you would need to go the Perl route. Bottom line is that if you make something yourself, then do what works for you although don't be scared to ask for feedback early in your development, I'd especially recommend getting some from Alon and Liraz (seeing as they're the core devs).

As for software, ideally it comes from the Ubuntu Lucid repos. If what you want isn't there, or the version there is missing features or is buggy, then you can get it elsewhere. Official PPAs or alternative (Lucid) repos (e.g. one provided by the software maintainer) are quite good options. Otherwise installing from source is fine, although try to stick with stable software if you can. As you'd know, installing from version-controlled repos (e.g. SVN, git, etc) under heavy development can have unpredictable and unreliable results.

Look forward to see what you come up with! :)

Chris Musty's picture

Backup Notifications

Hi all,

As mentioned above, it would be good to have some idea of what a backup is doing. I currently have an incremental backup underway but the only way I can tell is to look at the router's traffic logs. Is there a CLI command or similar I can use to tell what the server is doing? The TKL Hub does not tell me an increment is in progress, the "screen" in Proxmox does not tell me, and I have no real idea of what's happening.

Chris Musty

Director

Specialised Technologies

Liraz Siri's picture

Interesting ideas...

Thanks for the suggestions. I hadn't considered programming confconsole (what you call the Proxmox "screen") to display whether a backup is currently in progress. Adding this functionality to the TurnKey Hub is also worth looking into...

Anyhow, if you've enabled the daily backup cron job, the only way you can currently tell that it is in progress is to use the "ps" command to give you a process listing.
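
As a sketch of the "ps" approach, the process listing can be filtered for backup-related commands. The process names searched for (tklbam-backup and duplicity) are an assumption about what shows up in the listing on a given system, and the helper names are hypothetical.

```python
import subprocess


def backup_processes(ps_output: str, names=("tklbam-backup", "duplicity")):
    """Return (pid, command line) pairs from `ps -eo pid,args` output that look like a backup."""
    matches = []
    for line in ps_output.splitlines()[1:]:  # first line is the column header
        pid, _, args = line.strip().partition(" ")
        if any(name in args for name in names):
            matches.append((int(pid), args.strip()))
    return matches


def running_backups():
    """Run ps on the local machine and report matching processes."""
    listing = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    ).stdout
    return backup_processes(listing)
```

An empty list from `running_backups()` suggests no backup is in progress; a non-empty one shows the process id and full command line of each match.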

Chris Musty's picture

Incremental backup

Still testing the large file system backups and something strange happened.

In my log file I get the following;

 

################################
### Thu Jul  7 23:12:52 2011 ###
################################
Local and Remote metadata are synchronized, no sync needed.
Last full backup date: Fri Jun 24 10:56:41 2011
No extraneous files found, nothing deleted in cleanup.
Local and Remote metadata are synchronized, no sync needed.
Last full backup date: Fri Jun 24 10:56:41 2011
 
--------------[ Backup Statistics ]--------------
StartTime 1310019970.16 (Thu Jul  7 06:26:10 2011)
EndTime 1310078960.32 (Thu Jul  7 22:49:20 2011)
ElapsedTime 58990.16 (16 hours 23 minutes 10.16 seconds)
SourceFiles 85291
SourceFileSize 114024646758 (106 GB)
NewFiles 147
NewFileSize 6977227819 (6.50 GB)
DeletedFiles 8
ChangedFiles 41
ChangedFileSize 13142493438 (12.2 GB)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 196
RawDeltaSize 7031690079 (6.55 GB)
TotalDestinationSizeChange 5856554933 (5.45 GB)
Errors 0
-------------------------------------------------
 
It appears to have backed up 6 GB or so, but the incremental backup does not appear in the Hub's list of TKLBAM backups?
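
As an aside, the statistics block above already contains enough to work out an effective upload rate after the fact. The sketch below parses the "Name value" lines and divides TotalDestinationSizeChange by ElapsedTime, which for the figures shown comes to about 97 KiB/s. The helper names are hypothetical and the parsing assumes the layout shown in this log.

```python
import re

# Matches lines such as "ElapsedTime 58990.16 (16 hours ...)".
STAT_LINE = re.compile(r"^(\w+)\s+([0-9.]+)")


def parse_statistics(log_text: str) -> dict:
    """Collect the numeric 'Name value' pairs from a backup statistics block."""
    stats = {}
    for line in log_text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match:
            stats[match.group(1)] = float(match.group(2))
    return stats


def upload_rate_kib_per_s(stats: dict) -> float:
    """Average rate: bytes added at the destination per second of run time."""
    return stats["TotalDestinationSizeChange"] / stats["ElapsedTime"] / 1024


if __name__ == "__main__":
    sample = (
        "ElapsedTime 58990.16 (16 hours 23 minutes 10.16 seconds)\n"
        "TotalDestinationSizeChange 5856554933 (5.45 GB)\n"
        "Errors 0\n"
    )
    print(f"{upload_rate_kib_per_s(parse_statistics(sample)):.1f} KiB/s")
```

A rate measured from past runs like this is one crude way to guess how long the next large backup might take.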

Chris Musty

Director

Specialised Technologies

Liraz Siri's picture

Is this still a problem?

Did you figure this out with Alon yet? Regardless of what the TurnKey Hub says if Duplicity says it uploaded the incremental delta, it is sitting there in S3. The Hub just provides a user interface. It doesn't really matter what it thinks. I'm guessing the Hub code that is supposed to update the list of backup sessions may have failed somehow.
Chris Musty's picture

Problem Solved

Alon was right onto it and the problem seems to be gone now - the hub displays what has been backed up. For lengthy backups it would be nice to get some indication of where it is at though.

Chris Musty

Director

Specialised Technologies

L. Arnold's picture

Hub and TKLBAM could use the following...

I Love TKLBAM!  Unfortunately I use it enough that I would love to see a few tweaks.  No criticism intended (isn't there a spell checker here in the edit box - I had criticizm just a minute ago?)

  1. It would be very helpful if "old backups" could be deleted within the Hub console, while maintaining the current "full backup cycle".
  2. I have mentioned this before, but I would love to be able to better annotate each backup set and each backup session.
  3. A "save" button to turn Daily Backups on and off (may be in one of the more recent builds).
  4. The ability to "compress" Backup Sets to Full Backups without going through the Restore Process.
  5. The "clear ability" to delete a backup set without harming the ability to back up the app in the future (I believe I have done this before but don't recall if I just deleted it in the Hub or ran a series of "apt-get remove --purge tklbam" and "apt-get install tklbam" commands to get this to happen).
  6. The ability to toggle between "active backup" and "archive status"...

Understandably many of these points are interrelated.

Supporting notes for current backup issue #1:

1: I happen to have a Domain Controller that I made the mistake of backing up while I was using it to back up a big batch of archive photos. Now I have about 30 GB in storage and it looks like about 3 cycles of full backups and incremental backups took place on this server. It is costing me 2/3 of my monthly Amazon storage cost.

I will likely just delete the whole backup on Amazon but I would like to retain the Domain Controller setup. Probably safer to delete the photos, run one more backup, then run a restore and move on. I don't want to risk moving 30 GB over the network though (as that potentially has its own cost).

I would rather remove and then reinstall tklbam. Not sure if it will pick up where it left off or start a new backup. Simply deleting the big backup makes me nervous in case I also break the ability to back up the app itself. (Caution is often unwarranted, though it is hard to tell when it is a waste of time.)

Liraz Siri's picture

Good feedback Arnold!

Many thanks for taking the time to flesh out your thoughts on how we can improve TKLBAM.

Items 1 and 2 are easy to implement technically, we just have to think a bit more about the implications and how to get the user experience right.

Regarding backup deletion, I'm assuming you mean you want the ability to delete a particular backup session?

If you accidentally backed up 30GB, why not just delete the backup record and start over from scratch? Do you mean that you want the ability to undo the last backup so that you don't have to start over?

In other words, if you noticed you accidentally backed up too many files, you could add them to the overrides list and then delete the bad backup sessions on the Hub. Is that it?

Item 3 I don't fully understand. I think you can turn the cron job off and on in the Webmin module. Is that what you mean?

Item 4 is technically difficult to implement and would take significant computing resources on our end, so that will probably have to wait until we get other higher priority items out of the way.

Item 5 is already supported if I understand your meaning correctly. If you backup an appliance then delete the backup record on the Hub the next backup will create a new backup record. That seems to be equivalent to clearing the backup?

Finally, regarding item 6 what would you envision happening once you toggle a backup to "archive" mode and what problem would this solve?
