We have a Redmine appliance that has been running for a while: Linux version 10.04.01 and Webmin version 1.520. We use it to share architectural bid documents with the contractors bidding on our projects, so our usage level varies from week to week but is never extremely high.
I back it up weekly. On March 19 (about 3 weeks ago) I did the backup (it was a full backup since my last incremental was too old), then ran updates. I had not run updates for a very long while and figured it would be good to get all the latest security patches etc. When the update was done, the client site was not working. Since I was running out of time and needed this to be available, I restored from the backup I had done just before running the updates. The restore worked fine and everything was working again. However, after the restore Webmin reported that the amount of space used had just about doubled, which has just about filled the hard drive (only 4.96 GB free).
Two weeks ago I tried to run the backup again. It got most of the way through the full backup but then seemed to hang. I left it running over the weekend but it never finished cleanly. When I tried to run the backup again, it reported that there was a backup in progress and would not run. So this last weekend I rebooted the machine and tried running the backup again. Same problem: the backup gets very close to done, then hangs and errors out. Again it is saying that I can't run a backup since there is a previous backup in progress. We have bids underway with additional documents loaded and several additional users registered since the last successful backup. So I really want to get a good backup before I start digging into what doubled the used storage space and/or migrating to a clean new install.
Where in the logs locally could I find the specific error that is blocking the backup from finishing? Unfortunately I did not write down the specifics before closing out the browser. I'm assuming that is the right place to start troubleshooting. Or is there a safe way to clear up some space on the hard drive without a verified current backup?
Any ideas or suggestions on what is wrong and how I can get it fixed are really appreciated. I'm not extremely proficient with Linux so the more detail the better. Thanks for any help. Pat
Still can't get backup to finish
Still have the problem - whenever I try to run a backup, I get "back up in progress" error. I'll try rebooting the server this weekend when no users are on it and will try running a backup again to see if it will finish this time.
If anyone has any suggestions or adjustments that I should make before restarting the backup, please let me know. I appreciate any help. Thanks Pat
Pat
Back up in progress error
I rebooted the machine and reran the backup today. It errored out before finishing again. It looks like it stopped at about the same place as last time. This time I printed out the screen before closing it. Here is the end with everything before it cut out. Hope someone can look at this and tell me how to fix it.
Any help is appreciated. Thanks Pat
Pat
Deep apologies on my lack of response...
FWIW I've been flat out trying to get v14.1 out the door!
When you say it's been running for a while, you're not joking! 4-5 years I'd guess! :) v10.04.01 is actually the Ubuntu version that it was based on. It would be a TurnKey v11.x appliance. We've since moved to a Debian base (in v12.0 - Debian Squeeze). If this server is internet facing then I strongly suggest that you plan to migrate your data to a newer server, as Ubuntu stopped providing security updates for v10.04 almost a year ago...
WRT your issue, do you have plenty of free HDD space? It appears to be an I/O error and the most common cause for that is a full HDD. Also IIRC the earlier releases of v11.x had a couple of bugs. The first was that Webmin was saving stuff in /etc that shouldn't really be there (some of that you probably don't want in your backups, like bulk logs etc). That issue was compounded by etckeeper (which is producing all those etc/.git/objects entries) not running its garbage collection, so the git repo that saves all your config changes (and all the cruft from Webmin) grows exponentially over time. IIRC I had a very basic LAMP server which over the space of a year ended up with over 500MB of rubbish. Assuming that your server is one that was affected by this bug, I'd hate to imagine what level of cruft it has built up. Out of interest you can see how big your /etc dir is with this:
Or to just see how big the git repo that etckeeper stores everything in is:
If free space isn't your issue then I'm not really sure TBH. Technically we only support the previous major version (i.e. we're on v14.1 now so support only goes back to v13.0) but we hate to see customers stranded so will do what we can to help you out.
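The two size checks mentioned above, plus the free-space question, can be sketched roughly like this. It assumes the standard `df` and `du` utilities; the `dir_size` helper and the `ETC_DIR` override are names made up for this sketch, not the exact commands from the original post.

```shell
#!/bin/sh
# ETC_DIR is a hypothetical override for this sketch; it defaults to /etc.
ETC_DIR="${ETC_DIR:-/etc}"

# Print the human-readable size of a directory (first column of `du -sh`).
# Permission errors are hidden, so run as root for an accurate total.
dir_size() {
    du -sh "$1" 2>/dev/null | cut -f1
}

# Free space on the root filesystem.
df -h /

# Total size of the config directory.
echo "etc total: $(dir_size "$ETC_DIR")"

# Size of just the git repository that etckeeper maintains, if present.
if [ -d "$ETC_DIR/.git" ]; then
    echo "etc/.git:  $(dir_size "$ETC_DIR/.git")"
fi
```

If the second number is a large share of the first, the etckeeper repository is the likely culprit rather than the config files themselves.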
Disk is pretty full
I do really appreciate you taking the time to respond and for confirming my growing suspicions that the lack of free space on the drive might be the problem. Also thanks for all the work you do providing Turnkey Linux to us.
The disk is pretty full. When I did the restore from a backup the disk space that was used just about doubled. When I ran the commands you listed above I found that /etc/.git was very large: 24G out of a total drive size of 63G. I'm trying to get a good current backup, then plan to wipe out the old version and start over with the most current TurnKey Redmine appliance.
The hope is that I can import my data/users/and the like from the old Redmine databases. I have Redmine migration notes, so I know that I may have a couple of manual adjustments but think I can make it work OK. Since I plan to load the new Redmine appliance on a different machine I'll have the opportunity to work on it without impacting the current use.
Is there a best practice way to clean out /etc/.git/objects? Shall I just go in and manually remove them? Thanks again for the assistance.
Pat
Run a git garbage collect on your etc directory
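A minimal sketch of that, assuming etckeeper is using git and the repository lives in /etc/.git. The `gc_repo` helper and the `ETC_DIR` override are hypothetical names for this sketch; the only real command doing the work is `git gc`, run from inside the directory.

```shell
#!/bin/sh
# ETC_DIR is a hypothetical override for this sketch; it defaults to /etc.
ETC_DIR="${ETC_DIR:-/etc}"

# Compact the git repository in the given directory and show the size of
# its .git directory before and after.
gc_repo() {
    if [ ! -d "$1/.git" ]; then
        echo "no git repository in $1" >&2
        return 1
    fi
    echo "before: $(du -sh "$1/.git" | cut -f1)"
    (cd "$1" && git gc --quiet) || return 1
    echo "after:  $(du -sh "$1/.git" | cut -f1)"
}

# Run as root so git can read and rewrite everything under /etc/.git.
if [ -d "$ETC_DIR/.git" ]; then
    gc_repo "$ETC_DIR" || echo "git gc failed" >&2
fi
```

Note that `git gc` writes a new pack before removing the old objects, so it needs some free space to work with.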
Otherwise you can reinitialise it completely. You will lose the history of all your config changes but assuming you haven't changed anything lately and everything is working ok it should be fine. Do that like this:
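Roughly, that is an `etckeeper uninit` followed by an `etckeeper init` and a first commit. The sketch below wraps those in a dry-run switch so nothing is destroyed by accident; `DRY_RUN`, `run` and `reinit_etckeeper` are hypothetical names for this sketch, and the commit message is only an example.

```shell
#!/bin/sh
# DRY_RUN is a hypothetical safety switch for this sketch. It defaults
# to 1, which only prints the commands; set DRY_RUN=0 (as root) to run them.
DRY_RUN="${DRY_RUN:-1}"

# Either print a command or execute it, depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Throw away etckeeper's repository and start a fresh one.
# WARNING: this discards the whole history of config changes.
reinit_etckeeper() {
    run etckeeper uninit    # asks for confirmation, then removes the repo
    run etckeeper init      # creates a fresh, empty repository in /etc
    run etckeeper commit "fresh start after clearing history"
}

reinit_etckeeper
```

Run it once as-is to see what it would do, then again with `DRY_RUN=0` when you are happy to lose the old history.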
Without manually removing files from /etc (which IMO is generally a really bad idea...) that's probably as good as you'll get...
restore cache
You write you already tried to do a restore, and it doubled disk usage. I think tklbam-restore downloads and unpacks the data on local disk. It will be cached somewhere. Look up documentation (or wait for Jeremy) to find out how to clean up cached data.
Yep you need plenty of space to restore a backup
TBH I don't recall where TKLBAM stored the cache back in v11.x as I was just an enthusiastic volunteer back then! :) But if you're still having issues I can find out.
Cleared a LOT of space
When I ran "git gc" it found 939 objects but errored out saying "bus errorobjects 939". Since I have not updated or changed anything recently and the server is working OK, I went ahead and ran the uninit / init commands. That cleared out a lot - etc/.git went from using 24G down to using 33M. I'll run a full back up now and will post back if it finishes OK.
I could track down where the restore is cached but if I can get a clean new back up, then I'll just move forward with starting over with the newest Redmine appliance.
Thanks much for both of your help.
Pat
SOLVED - back up completed OK
Just to update: the backup ran and completed OK. Now I'll move on to starting up a clean, up-to-date Redmine appliance. Thanks again for your assistance with this.
Pat