Jeremy Davis's picture

Hi all,

Background (skip this bit if you want)
As many of you would be aware, TKL appliances include etckeeper (configured to use git) by default. That's all well and good, but I have a little bit of an issue... I have an old Tracks appliance that I run at home to try to keep myself productive and keep track of what I need to do and what I've done. Overall I'm very happy with it. I originally just set it up as a test server (over a year ago now, running under KVM). Seeing as I've been using it (and found a neat Android app called Shuffle that syncs with it) I decided that I would now run it under OVZ (much more efficient virtualisation) and keep it. I've converted the ISO to an OVZ and am about to set it up.

I figured the easiest way to transfer all my stuff across would be via TKLBAM. All good to this point. Out of curiosity I ran a TKLBAM simulation and to my surprise found the backup would include 550+MB!?!? WTF!?! I know I've been busy this last year but that seems ridiculous! So after some investigation with ncdu (thanks Liraz for this awesome tool!) I discovered that most of the backup comes from /etc/.git, which is where etckeeper stores all its info. So I decided that I could get rid of that... Enter my problem...

The Problem:
So the /etc/.git folder (etckeeper's git repo) contains ~550MB of data. How to cull this down a bit? I have tried a number of different approaches (including but not limited to apt-get remove --purge etckeeper && apt-get install etckeeper) but as soon as I reinitialise etckeeper (etckeeper init) most of it comes back!?! (less about 5MB that seems to have disappeared for good).

I have searched high and low online, both for etckeeper specifically and also for git in general, and can't see an easy way to get rid of the data (it just comes back as soon as etckeeper is reinitialised). Does anybody have any pointers for me?? The only permanent solution seems to be to not initialise etckeeper (i.e. not use it). I'd rather not do that (because I like having all my config settings backed up) but for now that is what I'm going to do.

IMO there needs to be an easy way to purge the etckeeper history and there just doesn't seem to be one. If I can't find an easy answer soon, I think I'll log this as a TKL bug (and an etckeeper one too).

Jeremy Davis's picture

How easy was that?!?

I couldn't actually test on my old Tracks appliance (with the 550MB /etc/.git) because I uninstalled (and purged) etckeeper and uninstalled (and purged) git then reinstalled them both, which brought it down to about 9MB.

But I just ran it on another instance that had crept up to ~55MB in /etc/.git and it's now down to less than 7MB. So it works a treat.

Thanks again :)

Alon Swartz's picture

Jeremiah emailed us a tip on etckeeper which we've already implemented for upcoming appliance builds, but is obviously useful for current deployments as well.

/etc/etckeeper/post-install.d/99git-gc

#!/bin/sh
# run git garbage collection after each apt run (Thank Jeremiah!)
exec git gc
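
To see how much a garbage collection actually reclaims, something like the sketch below could be used. The function names repo_size_kb and gc_report are made up for illustration (they are not part of etckeeper or git); only du and git gc are real commands, and it assumes you run it as root so git can write to /etc/.git. Note also that the hook file above most likely needs to be executable (chmod +x) for etckeeper to run it.

```shell
#!/bin/sh
# Hypothetical helpers, not part of etckeeper or git.

# Print the disk usage of a directory in kilobytes.
repo_size_kb() {
    du -sk "$1" | cut -f1
}

# Run git garbage collection in a work tree and print the size of
# its .git directory before and after.
gc_report() {
    dir="$1"
    before=$(repo_size_kb "$dir/.git")
    (cd "$dir" && git gc --quiet) || return 1
    after=$(repo_size_kb "$dir/.git")
    echo "before: ${before} KB, after: ${after} KB"
}

# Usage (as root):
#   gc_report /etc
```

Nothing runs until gc_report is called, so the snippet can be pasted into a shell and tried on a test repository first.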
Jeremy Davis's picture

The etckeeper man page is pretty sparse in this regard (not enough info IMO). I thought that there would've been some sort of option to do this easily somewhere and I got lost in the git documentation (info overload!) so I couldn't win!

BTW In case you didn't notice the 'bugs' I lodged, I'm pretty sure that most of the rubbish that had collected was from Webmin. It seems to store history in /etc/webmin which is pretty poor form IMO.

Scott's picture

I know this is an old thread .. but

> cd /etc
> git gc

This cleaned up 3.5GB ! from a git repository.

Glad you guys post these things - thanks.

 

Liraz Siri's picture

I'm thinking maybe we should add a weekly cron job that garbage collects etckeeper so that it doesn't get out of hand.

I added an issue to the bug tracker:

https://github.com/turnkeylinux/tracker/issues/256
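
As a rough idea of what such a weekly job might look like (a sketch only, not what will necessarily ship): a small script dropped into /etc/cron.weekly, assuming a Debian-style cron setup. The file name etckeeper-gc and the function name gc_if_repo are hypothetical.

```shell
#!/bin/sh
# Hypothetical /etc/cron.weekly/etckeeper-gc (name is an assumption).

# Garbage collect the git repository in the given directory, if there is one.
# Returns 0 without doing anything when no .git directory exists.
gc_if_repo() {
    dir="$1"
    [ -d "$dir/.git" ] || return 0
    (cd "$dir" && git gc --quiet)
}

# Default to /etc, where etckeeper keeps its repository.
gc_if_repo "${1:-/etc}"
```

The script must be executable, and because it only acts when /etc/.git exists it should be harmless on systems where etckeeper was never initialised.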
