Re: Hosting Git repositories: how useful will git-gc be?

On Thu, Sep 03, 2009 at 11:45:25AM +0200, Matthieu Moy wrote:

> A question: is it necessary/recommended/useless to set up a cron job
> doing a "git gc" in each repository? My understanding is that a push
> through ssh will do some packing, is it correct? Does receiving a pack
> trigger a "git gc --auto"?

The objects are transferred as a pack. If the number of objects is less
than receive.unpackLimit (default 100), they are unpacked into loose
objects. Otherwise we keep the pack, after completing any missing deltas
used by a thin pack.

So if you tend to push frequently, you will end up with a lot of loose
objects. Even if you have packs, they will be larger than necessary,
because objects in one pack are not stored as deltas against objects in
another. And of course you will eventually end up with a large number of
packs, which is less efficient (each pack has an index, but I believe we
search the indices linearly).
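
If you want to see how many loose objects and packs a hosted repository
has accumulated, a sketch like the following may help. It relies on
"git count-objects -v", which prints "count:" (loose objects) and
"packs:" lines among others. The repo_stats helper name and the
/srv/git/project.git path are made up for illustration.

```shell
# repo_stats: summarize `git count-objects -v` output read from stdin.
# (Hypothetical helper name; only the git command itself is real.)
repo_stats() {
    awk -F': ' '
        $1 == "count" { loose = $2 }   # number of loose objects
        $1 == "packs" { packs = $2 }   # number of pack files
        END { printf "loose=%d packs=%d\n", loose, packs }
    '
}

# Example invocation (the repository path is a placeholder):
# git --git-dir=/srv/git/project.git count-objects -v | repo_stats
```

Numbers that keep growing between runs suggest the repository would
benefit from a repack.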

Receiving a pack does not (AFAICT looking at the code) trigger a "gc
--auto".  Running it has other benefits, too, like pruning cruft and
packing refs. So I think it is probably a good idea to run it
periodically.

Running it daily or weekly is probably reasonable. You could run it on
every push using the post-update hook, but that may cause excessive I/O
for very little benefit.
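
A periodic job along these lines could be a starting point. It is only
a sketch: the /srv/git root, the "*.git" naming of bare repositories,
the gc_all function name, and the script path in the crontab line are
all assumptions, not anything specified in this thread.

```shell
# gc_all ROOT: run "git gc" in every repository directory directly
# under ROOT. Assumes repositories are directories named *.git (a
# common convention, but an assumption here). Setting GIT=echo gives
# a dry run that only prints the commands' arguments.
gc_all() {
    root=$1
    for repo in "$root"/*.git; do
        [ -d "$repo" ] || continue   # skip non-directories and an unmatched glob
        ${GIT:-git} --git-dir="$repo" gc --quiet ||
            echo "gc failed: $repo" >&2
    done
}

# Example crontab entry (weekly, Sunday 03:00), assuming the function
# above is saved in an executable script that ends with: gc_all /srv/git
# 0 3 * * 0  /usr/local/bin/git-gc-all
```

Running "GIT=echo gc_all /srv/git" first shows which repositories
would be touched without actually repacking anything.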

-Peff