Re: Git in Outreachy?

On Sat, Sep 4, 2021 at 8:50 PM Jeff King <peff@xxxxxxxx> wrote:
>
> On Sat, Sep 04, 2021 at 03:40:41PM +0800, ZheNing Hu wrote:
>
> > This may be a place to promote my patches: See [1][2][3].
> > It can provide some extra atoms for git cat-file --batch | --batch-check,
> > like %(tree), %(author), %(tagger) etc. Although some performance
> > optimizations have been made, it still has a small performance gap.
> >
> > If the community still expects git cat-file --batch to reuse the logic
> > of ref-filter, I expect it to get the attention of reviewers.
> >
> > The solutions I can think of to further optimize performance are:
> > 1. Delay the evaluation of some ref-filter intermediate data.
> > 2. Make the ref-filter code reentrant so that it can be called from
> > multiple threads to take advantage of multiple cores.
>
> I don't think trying to thread it will help much. For expensive formats,
> where we have to actually open and parse objects, in theory we could do
> that in parallel. But most of our time there is spent in zlib getting
> the object data, and that all needs to be done under a big lock.
>

This big lock is "obj_read_lock()", right? If that lock really is the
limiting factor, I am afraid the parallel approach will not help much.

> For little formats (e.g., just printing "%(refname)"), we need to
> serialize the output anyway. So our unit of work is so tiny, I suspect
> that the threading overhead would be a net negative.
>

Makes sense.

> I was coincidentally looking at ref-filter last week, and it seemed to
> me that a lot of the slowness is because of the over-use of malloc

Agreed. malloc() and data copying are the main reasons for the poor
performance of ref-filter.

> (e.g., we allocate a substring for every atom_value, and then form them
> into a separate buffer). If we could parse the original format into a
> form that could be traversed without having to do further allocations,
> just writing directly to a strbuf (or even a file handle), I think that
> would be a big improvement.
>

This patch tries to eliminate some of the malloc() calls and data copies:
https://lore.kernel.org/git/3760ff032bb1dec3812881fd408f8d78ec125477.1629184489.git.gitgitgadget@xxxxxxxxx/
It does yield some improvement.
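
To make the "parse once, then write directly" idea concrete, here is a
minimal sketch. It is not taken from the patch above or from ref-filter.c:
every name in it (struct segment, parse_format(), render(), struct ref_item)
is hypothetical, only two atoms are handled, and a plain char buffer stands
in for a strbuf. The format is split once into segments that point back into
the format string, and rendering copies bytes straight into the output with
no per-atom allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types; this is not the real ref-filter data model. */
enum seg_kind { SEG_LITERAL, SEG_REFNAME, SEG_OBJECTNAME };

struct segment {
	enum seg_kind kind;
	const char *text; /* SEG_LITERAL only: points into the format string */
	size_t len;       /* SEG_LITERAL only: number of literal bytes */
};

struct ref_item {
	const char *refname;
	const char *objectname;
};

#define MAX_SEGMENTS 16

/*
 * Split "literal %(atom) literal" into segments, once per format.
 * Nothing is allocated: literals are (pointer, length) views of fmt.
 * Returns the segment count, or -1 on an unterminated or unknown atom,
 * or when more than max segments would be needed.
 */
static int parse_format(const char *fmt, struct segment *segs, int max)
{
	int n = 0;

	assert(fmt && segs);
	while (*fmt) {
		const char *start = strstr(fmt, "%(");
		size_t lit = start ? (size_t)(start - fmt) : strlen(fmt);
		const char *name, *end;
		size_t namelen;

		if (lit) {
			if (n == max)
				return -1;
			segs[n].kind = SEG_LITERAL;
			segs[n].text = fmt;
			segs[n].len = lit;
			n++;
			fmt += lit;
		}
		if (!start)
			break;

		name = start + 2;
		end = strchr(name, ')');
		if (!end || n == max)
			return -1;
		namelen = (size_t)(end - name);

		if (namelen == 7 && !memcmp(name, "refname", 7))
			segs[n].kind = SEG_REFNAME;
		else if (namelen == 10 && !memcmp(name, "objectname", 10))
			segs[n].kind = SEG_OBJECTNAME;
		else
			return -1;
		segs[n].text = NULL;
		segs[n].len = 0;
		n++;
		fmt = end + 1;
	}
	return n;
}

/*
 * Write one item into out[0..cap) by walking the pre-parsed segments.
 * Returns the number of bytes written (not counting the NUL), or -1 if
 * the result does not fit.
 */
static int render(const struct segment *segs, int n,
		  const struct ref_item *item, char *out, size_t cap)
{
	size_t pos = 0;
	int i;

	if (!cap)
		return -1;
	for (i = 0; i < n; i++) {
		const char *src;
		size_t len;

		switch (segs[i].kind) {
		case SEG_LITERAL:
			src = segs[i].text;
			len = segs[i].len;
			break;
		case SEG_REFNAME:
			src = item->refname;
			len = strlen(src);
			break;
		case SEG_OBJECTNAME:
			src = item->objectname;
			len = strlen(src);
			break;
		default:
			return -1;
		}
		if (len >= cap - pos) /* always keep room for the NUL */
			return -1;
		memcpy(out + pos, src, len);
		pos += len;
	}
	out[pos] = '\0';
	return (int)pos;
}
```

With this shape, parse_format() would run once per invocation and render()
once per ref, so the per-ref cost is a handful of memcpy() calls. A real
version would have to cover the full atom syntax and would presumably append
to a strbuf or write to a file handle rather than a fixed buffer.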

> I just posted the results of some of my experiments to the list:
>
>   https://lore.kernel.org/git/YTNpQ7Od1U%2F5i0R7@xxxxxxxxxxxxxxxxxxxxxxx/
>
> I don't think that gives any kind of useful base to build on, but it
> shows what's possible by skipping past various segments of the
> ref-filter code.
>
> -Peff

Thanks.
--
ZheNing Hu



