Re: People unaware of the importance of "git gc"?

On Wed, Sep 05, 2007 at 07:09:27AM +0000, Linus Torvalds wrote:
> I've been against automatic repacking, but that was really based on what 
> appears to be potentially a very wrong assumption, namely that people 
> would do the manual repack on their own. If it turns out that people don't 
> do it, maybe the right thing for git to do really is to at least notify 
> people when they have way too many pack-files and/or loose objects.

  Well, independently of whether users should be expected to run gc on
their own, the big nasty problem with repacking is that it's really
slow. And I just can't imagine that the git I use to commit blazingly
fast would then be unavailable for a long time (repacks on my projects,
which are not as big as the kernel but still, usually take 10 to 20
seconds or more each).

> I personally repack everything way more often than is necessary, and I had 
> kind of assumed that people did it that way, but I was apparently wrong. 
> Comments?

  I do, when I'm bored and can't get things done. You know, it has
become one of my many twitches when I have an empty tty in front of me
and I'm doing nothing useful. When I'm in a hack attack, though, I
don't necessarily remember to repack. I'm in one of the (not so
many?) very lucky companies (yay start-ups) where I could show that git
was far superior, and we now use it as our sole SCM. So when I'm in a
hack attack, it's usually a busy week, and new patches, trees and
objects (sometimes with large binary things in them) flow like hell.
And the repository grows larger and larger. The way we chose to avoid
the "I'm coding, don't bother me with administrivia" attitude is that
our users have a small cron job that basically runs git gc each day,
and an aggressive repack (with a window of 50 or 100, I don't remember)
each weekend. Because the best criterion for when to repack a
repository is: when there is no one at the computer.
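
  A minimal sketch of that kind of cron setup, in Python. The script
name git-maint.py, the repository path, the schedule, and the choice of
"git repack -a -d -f --window=N" for the aggressive weekend run are all
assumptions made for illustration; only the daily git gc and the window
of 50 or 100 come from the description above.

```python
#!/usr/bin/env python3
"""Git maintenance meant to be run from cron (hypothetical git-maint.py).

Example crontab entries (paths and times are assumptions):
    0 3 * * *  /usr/bin/python3 /path/to/git-maint.py /path/to/repo
    0 4 * * 6  /usr/bin/python3 /path/to/git-maint.py /path/to/repo --weekly
"""
import subprocess
import sys


def maintenance_commands(weekly, window=50):
    """Return the git commands for one maintenance run, as argument lists."""
    if weekly:
        # -a: put everything into a single pack, -d: drop the packs made
        # redundant, -f: recompute deltas, --window: delta search window.
        return [["git", "repack", "-a", "-d", "-f", f"--window={window}"]]
    # Daily run: plain housekeeping.
    return [["git", "gc"]]


def run_maintenance(repo_path, weekly=False, window=50):
    """Run the maintenance commands inside repo_path; raise if git fails."""
    for command in maintenance_commands(weekly, window):
        subprocess.run(command, cwd=repo_path, check=True)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: git-maint.py REPO [--weekly]
    run_maintenance(sys.argv[1], weekly="--weekly" in sys.argv[2:])
```

  Scheduling both entries for hours when nobody is working is the whole
point: the slow repack never gets in the way of an interactive commit.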

  It has proven quite good, as we have never seen a repository explode
in a day, even after some funny mistakes where people rebased some big
parts of the tree many times, generating a very large number of loose
objects.
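
  To see how many loose objects and packs a repository has actually
accumulated, one option is to read the output of "git count-objects -v".
The sketch below parses that output; the function names and the
thresholds (5000 loose objects, 20 packs) are hypothetical and would
need tuning for a real repository.

```python
import subprocess


def parse_count_objects(output):
    """Parse `git count-objects -v` output into a dict of integers."""
    stats = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        # Skip lines whose value is not a plain number (e.g. "alternate:").
        if sep and value.isdigit():
            stats[key.strip()] = int(value)
    return stats


def needs_repack(stats, max_loose=5000, max_packs=20):
    """Return True when the (assumed) thresholds are exceeded."""
    # "count" is the number of loose objects, "packs" the number of packs.
    loose = stats.get("count", 0)
    packs = stats.get("packs", 0)
    return loose > max_loose or packs > max_packs


def repository_stats(repo_path):
    """Run git in repo_path and return the parsed statistics."""
    result = subprocess.run(
        ["git", "count-objects", "-v"],
        cwd=repo_path, check=True, capture_output=True, text=True,
    )
    return parse_count_objects(result.stdout)
```

  A cron job could call needs_repack(repository_stats(path)) and only
repack, or only send a reminder, when it returns True.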


  I know I don't really answer the question, but the point I'm trying
to make is that yes, some kind of automated way to run the gc is great,
but I'm not sure that _git_ is the tool that should automate it, because
when *I* use git, I expect it to be just plain fast, and I don't want it
to occasionally hang.

-- 
·O·  Pierre Habouzit
··O                                                madcoder@xxxxxxxxxx
OOO                                                http://www.madism.org
