Hi Andy,

On Sat, Feb 20, 2010 at 5:37 AM, Andrew Benton <b3nton@xxxxxxxxx> wrote:
> I have a project that I store in a git repository. It's a bunch of source
> tarballs and some bash scripts to compile it all. Git makes it easy to
> distribute any changes I make across the computers I run. The problem I have
> is that over time the repository gets ever larger. When I update to a newer
> version of something I git rm the old tarball but git still keeps a copy and
> the folder grows ever larger. At the moment the only solution I have is to
> periodically rm -rf .git and start again. This works but is less than ideal
> because I lose all the history for my build scripts.
>
> What I would like is to be able to tell git to not keep a copy of anything
> that has been git rm. The build scripts never get removed, only altered so
> their history would be preserved. Is it possible to make git delete its backup
> copies of removed files?

I don't know if I can really speak to your hoped for conclusion although
`git filter-branch` is where you want to look for rewriting history.
However, that's also an entirely impractical solution if your repo is at
all public because it would completely break sharing.

That being said, have you thought of changing your repo strategy? IMHO,
storing binary blobs that change at all regularly in _any_ SCMS is a
problem waiting to happen. It's different if you have assets that are
fairly stable like images for a system's UI or dependencies that have
been stabilized, but that doesn't sound like your situation.

As a thought, why not try to do something along the lines of maintaining
a symlink to whatever tarballs your project currently depends on as a
'foolib-latest' and then having a separate directory that has a file that
you can change. You could maintain backups of that using a tool like
rsync (since you obviously aren't concerned with maintaining history
there) rather than git.
Then you could decide arbitrarily how many backups you want to make and
try to maintain what version of the file went with which commit in your
repo. The main problem I see with that is that you lose a lot of the
advantages of having a SCMS because you can't reliably check out a
previous commit and build it; at least not without some very serious
effort.

Another possible solution, if you maintain the sources that are
generating the tarballs, is to treat the tarballs as artifacts of the
build rather than as assets that should be managed by the SCMS. That
way, you might spend more time during each build, but your repo would be
much cleaner and would have the added advantage of being able to
completely build itself at every commit point.

Anyway, just food for thought.

--
In Christ,
Timmy V.

http://burningones.com/
http://five.sentenc.es/ - Spend less time on e-mail
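
For reference, the `git filter-branch` route mentioned above might look roughly like the following. This is only a minimal sketch under stated assumptions: the throwaway repository and the file names (`foolib-1.0.tar.gz`, `build.sh`) are hypothetical, and the cleanup steps follow the general pattern of expiring reflogs and pruning after a rewrite. It changes every rewritten commit ID, which is why it breaks sharing for anyone who has already cloned the repo.

```shell
#!/bin/sh
# Sketch only: rewrite history so a removed tarball stops taking up space.
# The repository and file names below are hypothetical placeholders.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

# Commit 1: a build script plus a tarball.
echo 'echo building' > build.sh
head -c 100000 /dev/urandom > foolib-1.0.tar.gz
git add build.sh foolib-1.0.tar.gz
git commit -q -m "Add build script and foolib 1.0"
blob=$(git rev-parse HEAD:foolib-1.0.tar.gz)

# Commit 2: drop the tarball; git still keeps the old blob in history.
git rm -q foolib-1.0.tar.gz
echo 'echo building harder' >> build.sh
git commit -q -a -m "Remove foolib 1.0, tweak build script"

# Rewrite every commit so the tarball never appears in any tree.
# FILTER_BRANCH_SQUELCH_WARNING only silences the notice newer git prints.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch \
    --index-filter 'git rm -q --cached --ignore-unmatch foolib-1.0.tar.gz' \
    --prune-empty -- --all

# Drop the backup refs and reflog entries, then actually delete the objects.
rm -rf .git/refs/original
git reflog expire --expire=now --all
git gc -q --prune=now

# The build script keeps its history; the tarball is gone from all of it.
commits=$(git rev-list --count HEAD)
tarball_hits=$(git log --all --oneline -- foolib-1.0.tar.gz | wc -l | tr -d ' ')
if git cat-file -e "$blob" 2>/dev/null; then blob_gone=no; else blob_gone=yes; fi
echo "commits=$commits tarball_hits=$tarball_hits blob_gone=$blob_gone"
```

Every other clone would need to be re-cloned or hard-reset onto the rewritten history afterwards, so this is probably only reasonable while the repo is private to the computers you control.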