On Wed, Sep 05, 2007 at 07:09:27AM +0000, Linus Torvalds wrote:
> I've been against automatic repacking, but that was really based on what
> appears to be potentially a very wrong assumption, namely that people
> would do the manual repack on their own. If it turns out that people don't
> do it, maybe the right thing for git to do really is to at least notify
> people when they have way too many pack-files and/or loose objects.

Well, independently of whether users should be expected to run gc on
their own, the big nasty problem with repacking is that it's really
slow. I just can't imagine that the git I use to commit blazingly fast
would then be unavailable for a very long time (repacks on my projects,
which are not as big as the kernel but still, usually take more than 10
to 20 seconds each).

> I personally repack everything way more often than is necessary, and I had
> kind of assumed that people did it that way, but I was apparently wrong.
> Comments?

I do, when I'm bored and can't get things done. You know, it has become
one of my many twitches when I have an empty tty in front of me and I'm
doing nothing useful. Though when I'm in a hack attack, well, I don't
necessarily remember to repack.

I'm in one of the (not so many?) very lucky companies (yay start-ups)
where I could show that git was very superior, and we now use it as our
sole SCM. So when I'm in a hack attack, it's usually a busy week, and
new patches, trees and objects (sometimes with large binary things in
them) flow like hell. And the repository grows larger and larger.

Well, the way we chose to avoid the "I'm coding, don't bother me with
administrivia" attitude is that our users have a small cron job that
basically runs git gc each day, and an aggressive repack (with a window
of 50 or 100, I don't remember) each weekend. Because the best criterion
for repacking a repository is: when there is no one at the computer.
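
For illustration, the schedule described above could be sketched roughly
as follows. This is only a guess at the shape of such a cron job, not
the actual script: the function names are hypothetical, "weekend" is
assumed to mean Saturday, and the aggressive repack is assumed to be
`git repack -a -d -f` with an explicit `--window`.

```python
import datetime
import subprocess


def maintenance_commands(day, window=50):
    """Return the git commands to run on the given date.

    Assumption: the weekly aggressive repack happens on Saturday;
    every other day gets a plain `git gc`.
    """
    if day.weekday() == 5:  # Monday is 0, so 5 is Saturday
        # -a -d: repack everything into one pack and drop redundant packs
        # -f: recompute deltas instead of reusing existing ones
        return [["git", "repack", "-a", "-d", "-f", f"--window={window}"]]
    return [["git", "gc"]]


def run_maintenance(repo_path, day=None, window=50):
    """Run today's maintenance inside repo_path (meant to be called by cron)."""
    day = day or datetime.date.today()
    for cmd in maintenance_commands(day, window):
        subprocess.run(cmd, cwd=repo_path, check=True)
```

A daily cron entry scheduled for a quiet hour would then call
`run_maintenance()` with the repository path, so the slow repack only
runs when nobody is at the machine.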

It has proven quite good, as we have never seen a repository explode in
a day, even after some funny mistakes where people rebased some big
parts of the tree many times, generating a very large number of loose
objects.

I know I don't really answer the question, but the point I'm trying to
make is that yeah, some kind of automated way to run the gc is great,
but I'm not sure that _git_ is the tool to automate that, because when
*I* use git, I expect it to be just plain fast, and I don't want it to
occasionally hang.

-- 
·O·  Pierre Habouzit
··O  madcoder@xxxxxxxxxx
OOO  http://www.madism.org