Re: [PATCH] git exproll: steps to tackle gc aggression

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Junio C Hamano wrote:
>> 3. After a few days of working, the gc heuristics figure out that I
>> have too much garbage and too many packs; a cleanup is required. The
>> gc --auto which doesn't tolerate fragmentation: it tries to put
>> everything into one large pack.
>
> Today's "gc --auto" skims the history shallowly and packs loose
> objects into a single _new_ pack.  If you start from a single pack
> and enough loose objects and run "gc --auto", you will have two
> packs, one intact from the original, one new.

That is when I have lots of loose objects and few packs. We are
discussing gc aggression when there are too many packs (how did we get
there in the first place if new packs aren't created?): in which case
it doesn't tolerate fragmentation, and tries to put everything in one
pack. That is both IO and CPU intensive, and we've been trying to
tackle that since the start of the thread.

> And I see from "(or when ...)" part that you do not know what you
> are talking about.  "push" will not stream existing pack to the
> other end without computation, and it never will.  It will reuse
> existing individual pieces when it can, and having data in a pack
> (even without deltifying, or as a suboptimal delta) is much better
> for push performance than having the same data in a loose object if
> only for this reason, as you do not have to recompress and reencode
> it in a different way (loose objects and undelitifed object in pack
> are encoded differently).

Certainly. A push will never use an existing pack as-is, as it's very
highly unlikely that the server requested exactly what gc --auto
packed for us locally.

Sure, undeltified objects in the pack are probably better for push
performance, but I'm talking about the majority: deltified objects.
Don't you need to apply the deltas (ie. explode the pack), before you
can recompute the deltas for the information you're sending across for
the push?

> I've treated this thread as a design discussion, not an education
> session, but the above ended up having more education material than
> anything that would advance the design.

You will need to educate your contributors and potential contributors
if you want these problems to be fixed.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]