TurnKey Linux Virtual Appliance Library

TKLBAM - Seeding TKLHUB w/ initial full backup?

 

Hello,

I looked through the TKLBAM FAQ and did not see this covered -- so thought I would bring it up as a topic.  

The question / thought is this -- Does TKLBAM have any mechanism for seeding an initial full backup to the HUB?

The concept is this: for a large volume of data (let's say 72GB as an example, all homed on a TKL File Server appliance), doing an initial full backup into the HUB over standard US broadband is likely to take days, if not weeks. (I base this on both real-world experience and basic transfer-rate calculations.)
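A rough sketch of that transfer-rate calculation follows. The uplink speeds and the `transfer_days` helper are illustrative assumptions, not measurements, and real throughput will usually be lower once protocol overhead and encryption are factored in.

```python
def transfer_days(size_gb: float, uplink_mbps: float, efficiency: float = 1.0) -> float:
    """Estimate days needed to upload size_gb gigabytes over an uplink_mbps link.

    efficiency is a hypothetical fudge factor (0 < efficiency <= 1) for
    protocol overhead; 1.0 means the full nominal line rate is usable.
    """
    bits = size_gb * 1e9 * 8                      # decimal gigabytes -> bits
    seconds = bits / (uplink_mbps * 1e6 * efficiency)
    return seconds / 86400                        # seconds -> days


if __name__ == "__main__":
    # Assumed upload speeds in Mbit/s; substitute your own connection's figure.
    for mbps in (0.5, 1, 2, 5):
        print(f"72 GB at {mbps} Mbit/s: {transfer_days(72, mbps):.1f} days")
```

At a sustained 1 Mbit/s upload, 72GB works out to a little under a week of continuous transfer, before any overhead.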

With DFS (MS Windows) and rsync/rdiff (correct me if I am wrong on this), a good-practice solution to these bandwidth limitations is to seed the far-end host by physically placing the initial backup data on it from directly attached storage (USB 2.0 HDD, SATA, tape, etc.). Once that initial seeding is done, DFS, rsync, rdiff and similar tools can replicate just the deltas of file changes rather than whole files. (I.e. you change a cell in an Excel file, and only the changed portion of the file is sent over the wire.)
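To illustrate the delta idea, here is a minimal sketch that compares two versions of a file block by block and reports which blocks would need to be sent. This is a simplified fixed-block comparison, not the rolling-checksum algorithm rsync actually uses (so it will not cope well with insertions that shift data), and `BLOCK_SIZE`, `block_hashes` and `changed_blocks` are hypothetical names chosen for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, purely illustrative


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Return a SHA-256 digest for each fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return indexes of blocks in new that differ from, or do not exist in, old."""
    old_h = block_hashes(old, block_size)
    new_h = block_hashes(new, block_size)
    return [i for i, h in enumerate(new_h) if i >= len(old_h) or old_h[i] != h]


if __name__ == "__main__":
    seeded = bytes(4 * BLOCK_SIZE)   # the copy already on the far-end host
    current = bytearray(seeded)
    current[5000] = 1                # a single byte changes locally
    # Only the blocks listed here would have to cross the wire.
    print(changed_blocks(seeded, bytes(current)))
```

With a seeded copy in place, only the differing blocks are transferred, which is what makes the nightly runs feasible over a slow uplink.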

One use I have for the HUB is doing network backups for customers (as well as my own production servers). But the problem I can't seem to overcome is that initial huge volume of data, and if I then do 30-day fulls, I would face that problem on a monthly basis. I would love to have a solution where I can leverage the TKL appliances, in this example the file server appliance specifically, to do away with local on-site backup to media altogether and do straight network backups into the HUB.

(So many benefits to this: disaster recovery and continuity, elimination of the human element of fault, the ability to turn up servers in the hub to restore data or files, or to bring up servers in case of an on-site catastrophe, etc. It's also important to mention that Amazon S3 has a service that allows you to ship in media, which they then physically load into your buckets, thereby doing the initial seeding.)

In other words -- I would like to be able to take the human element out of backups (swapping tapes, taking offsite, replacing tape, monitoring and testing media, etc.) by writing all that data to the HUB -- but the only way to make that doable (as far as I can see) is to be able to seed the hub.

Does this capability exist, or is this something that would be worthwhile to have on the roadmap to be added?

Would love to start a discussion on the topic to see what others' thoughts are.


Thanks in advance!

Jeremy's picture

AFAIK it is not currently possible

But it is an interesting idea. If the facility exists to do it with S3, then theoretically it could perhaps be done with the Hub (as the Hub uses S3 by default, as you probably already know...). A couple of sticking points I can think of, though. Currently the S3 buckets aren't available via the AWS console, so I'm not sure how you could tell them where to put your data. Also, TKLBAM backups are encrypted by default. Not sure how you'd go about getting the data into some form that TKLBAM could make sense of...
