Re: auto gc again

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:

> On Wed, 19 Mar 2008, Junio C Hamano wrote:
>> 
>> Having said that, I am not sure how the auto gc is triggering for your
>> (presumably reasonably well maintained) repository that has only small
>> number of loose objects.  I haven't seen auto-gc annoyance myself (and
>> git.git is not the only project I have my git experience with), and Linus
>> also said he hasn't seen breakages.
>
> I think it was 'autopacklimit'.
>
> I think the correct solution is along the following lines:
>
>  - disable "git gc --auto" entirely when "gc.auto <= 0" (ie we don't even 
>    care about 'autopacklimit' unless automatic packing is on at all)
>
>    Rationale: I do think that if you set gc.auto to zero, you should 
>    expect git gc --auto to be disabled.

Sensible, I would say.

>  - make the default for autopacklimit rather higher (pick number at 
>    random: 50 instead of 20).
>
>    Rationale: the reason for "git gc --auto" wasn't to keep things 
>    perfectly packed, but to avoid the _really_ bad cases. The old default 
>    of 20 may be fine if you want to always keep the repo very tight, but 
>    that wasn't why "git gc --auto" was done, was it?

I do not think "very tight" was the reason, but on the other hand, my
personal feeling is that 20 was already 10 too many pack idx files we have
to walk linearly while looking for objects at runtime.
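For illustration only, the two points so far could be sketched as below.
This is not git's actual code; the function name, its parameters, and the
way the two thresholds are compared are assumptions made for the sketch.
Only the "gc.auto <= 0 disables everything" rule comes from the proposal
quoted above.

```python
def should_auto_gc(gc_auto, auto_pack_limit, loose_count, pack_count):
    """Hypothetical decision helper for "git gc --auto" (illustrative only)."""
    # Proposal from the thread: gc.auto <= 0 turns auto gc off entirely,
    # so autopacklimit is not even consulted.
    if gc_auto <= 0:
        return False
    # Assumption: too many loose objects triggers a gc.
    if loose_count > gc_auto:
        return True
    # Assumption: a positive pack limit that is exceeded also triggers a gc.
    if auto_pack_limit > 0 and pack_count > auto_pack_limit:
        return True
    return False
```

With gc_auto set to 0, the helper returns False no matter how many packs
exist, which is the behaviour the first bullet asks for.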

Each auto gc that sees too many loose objects will add a new packfile (we
do not do "repack -a" for obvious reasons) that would hopefully contain
6-7k objects, so you would need to generate 120-140k objects before
hitting the existing 20 limit.

And then auto gc will notice you have too many packs, and "repack -A" to
pack them down into a single new pack, and you are back to the "single pack
with fewer than 6-7k loose objects" situation for the cycle to continue.

At least, that is the theory.
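A toy model of that cycle, as a rough sketch: it assumes a loose-object
threshold of 6700 (a value in the "6-7k" range above), the pack limit of
20, a repository that starts with one pack, and one new pack per
incremental gc.  The names are made up for the illustration and this is
not how git itself is implemented.

```python
LOOSE_LIMIT = 6700  # assumed loose-object threshold ("6-7k" above)
PACK_LIMIT = 20     # the existing autopacklimit discussed in the thread

def simulate_auto_gc(new_objects, loose_limit=LOOSE_LIMIT, pack_limit=PACK_LIMIT):
    """Return the object counts at which a full repack would happen."""
    loose, packs = 0, 1          # start from a single pack, nothing loose
    full_repacks = []
    for n in range(1, new_objects + 1):
        loose += 1
        if loose > loose_limit:
            # Incremental gc: the loose objects go into one new pack.
            packs += 1
            loose = 0
            if packs > pack_limit:
                # Too many packs: everything is packed down into one.
                packs = 1
                full_repacks.append(n)
    return full_repacks
```

Under these assumptions the first full repack comes only after well over
100k new objects, in line with the 120-140k estimate.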

The kernel history with 87k commits has 720k objects, which roughly
translates to 8 objects per commit on average.  You would need to make
13k commits to generate 100k new loose objects.  I am sensing that Jens is
mightily annoyed, and rightfully so, by observing a much shorter cycle than
that for "gc --auto" to kick in ("rev-list --author=Jens --since=8.month master"
tells me there are 145 commits in the last 8 months, far fewer than
13k).  So there is something else going on.
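The back-of-the-envelope numbers can be checked with a few lines.  The
inputs are the rounded figures quoted above; the variable names are just
for this sketch, and the "8 objects per commit" average is a rough
estimate, not a measurement of any particular work pattern.

```python
commits_in_history = 87_000
objects_in_history = 720_000

# Average objects per commit, rounded as in the text above.
objects_per_commit = round(objects_in_history / commits_in_history)

# Commits needed to create 100k new loose objects at that average.
commits_for_100k = 100_000 / objects_per_commit

# Objects that 145 commits would create at the same average.
objects_from_145_commits = 145 * objects_per_commit

print(objects_per_commit, commits_for_100k, objects_from_145_commits)
```

At that rate 145 commits account for only about a thousand objects, which
is why the commits alone do not explain the frequent "gc --auto" runs.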

Perhaps fetching with dumb transports should run "gc --auto" (or even an
unconditional "repack -a -d") at the end?


