ddonovan's picture

I started a TKLBAM backup early this morning and it's apparently finished - TKLBAM itself in the Webmin module shows the new backup, my S3 account shows the right number of backup files as well as the signatures and manifest, and tklbam-status shows this:
TKLBAM:  Backup ID #4, Updated Thu 2018-08-16 04:10

BUT...

There's still a tklbam-backup process chewing up 75% of two cores, and the Hub doesn't show the backup yet despite my refreshes.

What is it still doing at this point, about 8 hours after the backup apparently finished?  Is there a progress monitor for TKLBAM?

Jeremy Davis's picture

I wonder if you're experiencing the bug that Bill Carney notes in this thread? It seems similar. Can you have a read through that and see if it matches your experience? Unfortunately, there isn't any way (at least that I'm aware of) to monitor what a TKLBAM process initiated by cron is doing. You could check the log, although I'm not 100% sure that will help. You may need to kill the current backup and manually run it from the command line to see what it's up to.
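In the absence of a real progress monitor, one rough way to at least see how long a backup process has been alive and how much CPU it has used is to filter the output of ps. The following is only a minimal sketch, assuming Python 3.7+ and a Linux ps that accepts the pid, etime, pcpu and args output columns; the helper names parse_ps and running_backups are made up for illustration and are not part of TKLBAM.

```python
import subprocess


def parse_ps(output, needle="tklbam-backup"):
    """Return (pid, elapsed, cpu_percent, command) for each matching process.

    `output` is the text printed by `ps -eo pid,etime,pcpu,args`.
    """
    matches = []
    for line in output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 3)  # the command may itself contain spaces
        if len(parts) == 4 and needle in parts[3]:
            pid, etime, pcpu, args = parts
            matches.append((int(pid), etime, float(pcpu), args))
    return matches


def running_backups():
    """List running processes whose command line mentions tklbam-backup."""
    out = subprocess.run(
        ["ps", "-eo", "pid,etime,pcpu,args"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ps(out)


if __name__ == "__main__":
    for pid, etime, pcpu, args in running_backups():
        print(f"PID {pid}: running for {etime}, {pcpu}% CPU: {args}")
```

Note that the CPU figure ps reports is an average over the life of the process, not a live reading, and none of this shows what the process is actually working on. It only confirms that it is still running and for how long.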

As noted in that thread, Bill had migrated from an older version to v14.x. With a copy of Bill's DB, I could reproduce the issue and it seemed related to MySQL config (old config running on a newer server). However, the fix that worked for me (using his DB with the default Debian Jessie/v14.x MySQL conf) didn't seem to resolve it for him. So unfortunately we never did manage to pin it down exactly or resolve it.

Regardless of whether it's the same bug or not, it sounds like your backup isn't "finalising". So whilst it has pretty much finished, for some reason it hasn't stopped and reported to the Hub that it's done. As Bill noted, if he manually kills the MySQL process, that essentially resolves it, but it's really only a workaround for a manual backup, no good for automatic ones...

ddonovan's picture

I don't think it's the same as Bill Carney's issue.  For one, I'm not using the integrated MySQL database; I'm using an external MySQL server which isn't part of tklbam's backups.  And it's the tklbam-backup process that's hanging, not MySQL.

Jeremy Davis's picture

I forgot that you were using a separate MySQL server, so you are likely right. Although if TKLBAM is still trying to dump the DB (even if you aren't using it), unless you have totally removed MySQL from your server, it may still be the cause?

Perhaps it's worth just trying the service mysql restart that Bill notes allows his TKLBAM to complete? Just to see if it makes any difference?

If it does, unfortunately, I'm not 100% clear how the MySQL support works in TKLBAM, so I'm also not clear on the best way to disable it for MySQL. One way would be to tweak the cron job to include the --skip-database switch when running TKLBAM. AFAIK, it should also be possible to skip DB backup via the overrides file, but I've never actually tried to skip all DBs via that, so I'm not sure if just adding -mysql would work, or whether it'd need something more like -mysql:* or something else altogether...
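To test the theory by hand before touching the cron job, a manual run with the --skip-database switch could be wrapped in a time limit so that a stalled run doesn't hang around for hours. This is only a sketch, assuming Python 3 and that tklbam-backup is on the PATH; the helper names and the four-hour limit are arbitrary choices for illustration, not anything TKLBAM provides.

```python
import subprocess


def backup_command(skip_database=True):
    """Build the tklbam-backup command line, optionally skipping the DB dump."""
    cmd = ["tklbam-backup"]
    if skip_database:
        cmd.append("--skip-database")
    return cmd


def run_backup(timeout_seconds=4 * 3600):
    """Run the backup in the foreground; return its exit code, or None on timeout.

    The 4 hour default is an arbitrary assumption; pick something comfortably
    longer than a normal backup takes on your server.
    """
    try:
        # Output goes straight to the terminal so you can watch what it does.
        result = subprocess.run(backup_command(), timeout=timeout_seconds)
        return result.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child process before raising this.
        return None


if __name__ == "__main__":
    code = run_backup()
    if code is None:
        print("Backup did not finish within the time limit and was killed")
    else:
        print(f"Backup exited with status {code}")
```

If a run with the database skipped finishes cleanly and reports to the Hub, that would point at the database step; if it still stalls, MySQL is probably not the culprit.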

Having said all that, you may well be correct and it's completely unrelated. But it seems like fairly low hanging fruit to check for... The only other options that occur to me would be doing some lower level python debugging. Perhaps the stacktrace you'd likely get from hitting <Ctrl><c> while it's stalled may be of some value?
