RobboRob's picture

Unfortunately, today I needed the TKLBAM restore functionality and, even more unfortunately, it fails during the restore, producing the following errors:

Last full backup date: Fri Sep 6 14:01:35 2013 
Download s3://s3.amazonaws.com/tklbam-xrbucjpil7kx55oh/duplicity-inc.20130912T125001Z.to.20130912T172613Z.vol159.difftar.gpg failed (attempt #1, reason: BotoServerError: BotoServerError: 500 Internal Server Error 
<?xml version="1.0" encoding="UTF-8"?> 
<Error><Code>InternalError</Code><Message>We encountered an internal error. Please try again.</Message><RequestId>E2B7123E85648962</RequestId><HostId>N/Hv+hk+MjCBNFoX9sqKiZ61GCKBT/F+zDqd6r4pX4TI8/OdZ/M8yF+FmKKIifxq</HostId></Error>) 
Download s3://s3.amazonaws.com/tklbam-xrbucjpil7kx55oh/duplicity-inc.20130912T125001Z.to.20130912T172613Z.vol196.difftar.gpg failed (attempt #1, reason: BotoServerError: BotoServerError: 500 Internal Server Error 
<?xml version="1.0" encoding="UTF-8"?> 
<Error><Code>InternalError</Code><Message>We encountered an internal error. Please try again.</Message><RequestId>8F44919CEB335758</RequestId><HostId>nuGmj64Ll0gug04S7bFsSnTnCICkJOrIpIcsrctafwjL2yHcoH8FTlcmTSk7Hn6A</HostId></Error>) 
Download s3://s3.amazonaws.com/tklbam-xrbucjpil7kx55oh/duplicity-inc.20130912T125001Z.to.20130912T172613Z.vol245.difftar.gpg failed (attempt #1, reason: BotoServerError: BotoServerError: 500 Internal Server Error 
<?xml version="1.0" encoding="UTF-8"?> 
<Error><Code>InternalError</Code><Message>We encountered an internal error. Please try again.</Message><RequestId>50B5382B70742F69</RequestId><HostId>RqTzeVZEsQWkQGA1N3LDaH2XhFJ81mZpK+t55X3lRdmdC1loZk4AoPlHq4PmFOqy</HostId></Error>) 
Download s3://s3.amazonaws.com/tklbam-xrbucjpil7kx55oh/duplicity-inc.20130914T094152Z.to.20130915T094117Z.vol3.difftar.gpg failed (attempt #1, reason: timeout: timed out) 
Download s3://s3.amazonaws.com/tklbam-xrbucjpil7kx55oh/duplicity-inc.20130914T094152Z.to.20130915T094117Z.vol3.difftar.gpg failed (attempt #2, reason: IncompleteRead: IncompleteRead(0 bytes read, 6124 more expected)) 
Download s3://s3.amazonaws.com/tklbam-xrbucjpil7kx55oh/duplicity-inc.20130914T094152Z.to.20130915T094117Z.vol3.difftar.gpg failed (attempt #3, reason: IncompleteRead: IncompleteRead(0 bytes read, 6124 more expected)) 
Download s3://s3.amazonaws.com/tklbam-xrbucjpil7kx55oh/duplicity-inc.20130914T094152Z.to.20130915T094117Z.vol3.difftar.gpg failed (attempt #4, reason: IncompleteRead: IncompleteRead(0 bytes read, 6124 more expected)) 
Download s3://s3.amazonaws.com/tklbam-xrbucjpil7kx55oh/duplicity-inc.20130914T094152Z.to.20130915T094117Z.vol3.difftar.gpg failed (attempt #5, reason: IncompleteRead: IncompleteRead(0 bytes read, 6124 more expected)) 
Giving up trying to download s3://s3.amazonaws.com/tklbam-xrbucjpil7kx55oh/duplicity-inc.20130914T094152Z.to.20130915T094117Z.vol3.difftar.gpg after 5 attempts 
BackendException: Error downloading s3://s3.amazonaws.com/tklbam-xrbucjpil7kx55oh/duplicity-inc.20130914T094152Z.to.20130915T094117Z.vol3.difftar.gpg 

The restore then stops.  What are my options?

___

Progress: I'm still getting the errors, each time on different files (sometimes files are read correctly and then others are read with errors).  I'm using the --force option now, so hopefully the restore will at least restore as much as possible...

Jeremy Davis's picture

TBH I have no idea what is going on here and why you are having these errors... There seem to have been a few people experiencing issues with backups (although I haven't) but you are the only poster (that I have seen) having issues with restore... 

The only thing I could suggest is that you try contacting the devs direct via the Hub's Feedback (blue button on left hand side when logged in).

I did a 22G restore and had a number of these BotoServerError but it completed OK.

Just out of interest, do you have multiple routes to the internet or load balancing? That idea crossed my mind as we run different connection types and I watched the traffic flipping between WANs as other users fired up services during the restore process.

Stu


RobboRob's picture

@StuC: No multiple routes, no load balancers.

I finally gave up after 3 days of restoring without success.  Almost 8 out of 10 times, reading the files failed (often after hours of waiting). When it didn't, it stopped during the restore phase (very early in this stage). I've been watching the restore closely with top, df and du during the process to make sure no filesystem was filling up and that the expected processes were still running...

The only things I suspect might have affected the backup that was created, and therefore the result of the restore:

1) during this cycle of full backup and incrementals the location of the database files was changed from /var/lib/mysql to /mnt/var/lib/mysql (due to space restrictions)

2) the location of all the webserver and database data was on /mnt/var (and in sub folders), which is the reason for the restore (when my server was upgraded from a small to a medium-sized, CPU-optimized server at AWS, this whole volume was replaced and no longer contains the data previously on it)...

3) total size of the backup was 30GB, with a full backup made on Sept. 6 and incrementals up to Sept. 23, which might be too much for the TKLBAM software to deal with???

All in all I'm very disappointed in this solution. I've lost approx. 9 days of production data (luckily I had some export of the most important databases from the 14th still available)... I wrote my own backup scripts to secure my data for the future...

It's easy for me to be wise after the fact, but the biggest issue would appear to be running incremental backups across a file system rework. I'd not trust that on local tape.

For me it would be: full backup, change layout, then a new full backup on another tape (NEW LAYOUT).

Not sure it's fair to blame TKLBAM, but then I have not used many fancy incremental solutions, so this may be everyday fare to others.

I might change the frequency of full backups on the (small scale) servers I look after, as it is quite possible I could be opening myself up to issues I don't need.


Question;

Could you restore the server (to a virtual box locally) to just before the layout change, then make the layout change and try the next incremental? if that works make as many backups as physical reality allows - delete the TKLBAM backups and backup/restore to the online service?


RobboRob's picture

Question;

Could you restore the server (to a virtual box locally) to just before the layout change, then make the layout change and try the next incremental? if that works make as many backups as physical reality allows - delete the TKLBAM backups and backup/restore to the online service?

I'm not sure when the change was made, so without a lot of iterating through long trials this is going to be difficult.  After three days of restore attempts I gave up and rebuilt from a previous export...

Liraz Siri's picture

The new version of TKLBAM should make mixing and matching TKLBAM with whatever local backup setup you want much easier. Instead of forcing you to go through Duplicity and make "fancy incremental backups" you can ask TKLBAM to just dump the system backup data to a plain old directory and let you do whatever you want with it. It also supports restoring from a directory.

Example use case:


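# on system A
tklbam-backup --dump=/tmp/mybackup
cd /tmp
# archive with a relative path so it unpacks to /tmp/mybackup, not /tmp/tmp/mybackup
tar jcvf mybackup.tar.bz2 mybackup

# on system B (after copying mybackup.tar.bz2 to /tmp)
cd /tmp
tar jxvf mybackup.tar.bz2
tklbam-restore /tmp/mybackup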
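
If the archive has to travel between machines, it may be worth checking that it arrived intact before restoring from it. A minimal sketch, assuming sha256sum (coreutils) is available on both systems; write_checksum and verify_checksum are hypothetical helper names, not part of TKLBAM:

```shell
# write_checksum FILE: store FILE's SHA-256 in FILE.sha256, next to FILE.
# Hypothetical helper, not a TKLBAM command.
write_checksum() {
    ( cd "$(dirname "$1")" && sha256sum "$(basename "$1")" > "$(basename "$1").sha256" )
}

# verify_checksum FILE: succeed only if FILE still matches FILE.sha256.
# Hypothetical helper, not a TKLBAM command.
verify_checksum() {
    ( cd "$(dirname "$1")" && sha256sum -c "$(basename "$1").sha256" > /dev/null 2>&1 )
}
```

On system A you would run write_checksum /tmp/mybackup.tar.bz2 and copy both files across; on system B, verify_checksum /tmp/mybackup.tar.bz2 should succeed before you unpack and restore.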

FWIW, TKLBAM v1.3 is the new version in TurnKey 13 RC3 package archive. I'm working on backporting it to TurnKey 12 right now.

That sounds a great option to have.

Where net connections are slow/busy the ability to manipulate local backups in such a simple form should make testing or secondary backups so much quicker. I was a reluctant convert to cloud backup and bandwidth used during restore checks or changes presented a bit of a log jam for us UK users where significant parts of the country are still on slow ADSL.

The flexibility and just plain can-do of this system grows daily, I'd be genuinely lost without the ability to set up a particular server in minutes now.

I've had conversations about a possible requirement and in the time it takes to describe the need I've had a VM downloaded and ready for test. Awesome service.


Jeremy Davis's picture

TKLBAM accepts the --raw-download=path/to/backup/ switch which should do what you are asking (I think?).

[update] Sorry, I just reread a little of this thread and I think I understand what you are saying... You actually want to download direct from the S3 bucket (without using TKLBAM at all)? TBH I have no idea of how you would go about that...

Jeremy Davis's picture

"Can't connect" errors won't be avoided, but perhaps using the --limits= switch you could just download the DB. At least then you might have a better chance of it downloading the bit you really need (assuming the DB is the bit that is most important). You could also try that to download specific parts of the FS if need be.

Have you double checked your networking/internet to make 100% sure nothing is going weird there?

Another TKLBAM switch that might be of use is the --simulate switch. Perhaps that might give you a better idea of what/where it is actually going wrong?

Also even though at the commandline it is showing an error, it might be worth checking the error log (IIRC /var/log/tklbam) as perhaps that has some additional info?
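
Since the failures in the log at the top of the thread look transient (500 Internal Server Error, timeouts, IncompleteRead), one more thing that might be worth a try is wrapping the whole command in a retry loop with a growing pause between attempts. This is only a rough sketch: retry is a hypothetical helper (not part of TKLBAM), the 30 second starting delay is a guess, and it assumes the command is safe to rerun from scratch.

```shell
# retry MAX CMD [ARGS...]: run CMD until it succeeds or MAX attempts are used up.
# Hypothetical helper, not part of TKLBAM. The pause doubles after each failure.
retry() {
    max=$1
    shift
    attempt=1
    delay=${RETRY_DELAY:-30}   # seconds before the first retry (assumed default)
    while :; do
        "$@" && return 0
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
        delay=$((delay * 2))
    done
}
```

For example, something like retry 5 tklbam-restore 1 --raw-download=/mnt/restore, where 1 stands in for your backup ID and /mnt/restore is a placeholder path. Each attempt restarts the command, so this only helps if a single run has a reasonable chance of getting through.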

FWIW TKLBAM is currently a free service. But by default it is configured to use Amazon S3 storage (which you pay for). TurnKey Linux do not get a cut of that (actually it costs TurnKey to do it because of the way Amazon billing works), that is purely the Amazon cost of storage. It's a bit like giving you a car but you still need to pay for fuel...!

Additionally, you have the option of storing your backups locally (or on a multitude of other options, so long as you can configure it; this is documented). Then you have to take responsibility for any associated storage costs directly yourself (rather than it being done via Amazon DevPay).

That's not to fob you off, as ideally TKLBAM should just work and it should be supported. On the flipside though it is free open source software which you are free to adjust (or pay someone to adjust) if you so desire. The code is available via GitHub.
