Re: [RFC PATCH] index-pack: improve performance on NFS

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Nov 08 2018, Jeff King wrote:

> On Wed, Nov 07, 2018 at 10:55:24PM +0000, Geert Jansen wrote:
>
>> On Mon, Oct 29, 2018 at 07:27:39PM -0400, Jeff King wrote:
>>
>> > On Mon, Oct 29, 2018 at 08:36:07PM +0100, Ævar Arnfjörð Bjarmason wrote:
>> > >  * Re-roll my 4 patch series to include the patch you have in
>> > >    <20181027093300.GA23974@xxxxxxxxxxxxxxxxxxxxx>
>> >
>> > I don't think it's quite ready for inclusion as-is. I hope to brush it
>> > up a bit, but I have quite a backlog of stuff to review, as well.
>>
>> We're still quite keen to get this patch included. Is there anything I can do
>> to help?
>
> Yes, testing and review. :)
>
> I won't send the series out just yet, as I suspect it could use another
> read-through on my part. But if you want to peek at it or try some
> timings, it's available at:
>
>   https://github.com/peff/git jk/loose-cache

Just a comment on this from the series:

    Note that it is possible for this to actually be _slower_. We'll do a
    full readdir() to fill the cache, so if you have a very large number of
    loose objects and a very small number of lookups, that readdir() may end
    up more expensive.

    In practice, though, having a large number of loose objects is already a
    performance problem, which should be fixed by repacking or pruning via
    git-gc. So on balance, this should be a good tradeoff.

Our biggest repo has a very large number of loose objects at any given
time, but the vast majority of these are because gc *is* happening very
frequently and the default expiry policy of 2wks is in effect.

Having a large number of loose objects is not per-se a performance
problem.

It's a problem if you end up "faulting" to from packs to the loose
object directory a lot because those objects are still reachable, but if
they're not reachable that number can grow very large if your ref churn
is large (so lots of expired loose object production).

Anyway, the series per-se looks good to me. It's particularly nice to
have some of the ODB cleanup + cleanup in fetch-pack.c

Just wanted to note that in our default (reasonable) config we do
produce scenarios where this change can still be somewhat pathological,
so I'm still interested in disabling it entirely given the
implausibility of what it's guarding against.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux