Re: [PATCH 14/19] [GSOC] cat-file: reuse ref-filter logic

On Thu, Jul 15, 2021 at 3:53 AM ZheNing Hu <adlternative@xxxxxxxxx> wrote:
>
> ZheNing Hu <adlternative@xxxxxxxxx> wrote on Thu, Jul 15, 2021 at 12:24 AM:
> >
> > Junio C Hamano <gitster@xxxxxxxxx> wrote on Tue, Jul 13, 2021 at 4:38 AM:

> > > I find it somewhat alarming if we are talking about "fast-path"
> > > workaround before understanding why we are seeing slowdown in the
> > > first place.
> >
> > There is no complete conclusion yet, but I timed these commits with time and
> > hyperfine (t/perf/* is not accurate enough):
> >
> > ------------------------------------------------------------------------------------------------------------------------
> > | subject                                                     | --batch-check (using hyperfine) | --batch (using time) |
> > ------------------------------------------------------------------------------------------------------------------------
> > | [GSOC] cat-file: use fast path when using default_format    | 700ms                           | 25.450s              |
> > | [GSOC] cat-file: re-implement --textconv, --filters options | 790ms                           | 29.933s              |
> > | [GSOC] cat-file: reuse err buf in batch_object_write()      | 770ms                           | 29.153s              |
> > | [GSOC] cat-file: reuse ref-filter logic                     | 780ms                           | 29.412s              |
> > | The third batch (upstream/master)                           | 640ms                           | 26.025s              |
> > ------------------------------------------------------------------------------------------------------------------------
> >
> > I think the cost does indeed come from "[GSOC] cat-file: reuse ref-filter logic",
> > but what causes the loss of performance needs further analysis.
>
> Now I think there are three main reasons why the performance of cat-file --batch
> deteriorates after the refactor:
>
> 1. ref-filter makes too many copies, and we cannot easily avoid them because
> ref-filter needs the copied data to implement the atoms %(if), %(else),
> %(end)... and the --sort option. The original cat-file --batch only needs to
> write the data to the final output string, so it copies relatively little.

Is it possible to check early whether any of the atoms that need this
copied data are specified, and to avoid the copies if none of them
is?
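
To make the question concrete, here is a rough, standalone sketch of
the kind of early check I have in mind. It is not taken from
ref-filter.c: the list of atoms and the names buffering_atoms,
atom_needs_buffer() and format_needs_buffer() are all hypothetical,
and it ignores "%%" escaping (which at worst makes it fall back to
the copying path). A --sort check would have to be handled
separately.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical list of atoms whose output must be buffered before it
 * can be emitted. Illustration only; not the real ref-filter list.
 */
static const char *const buffering_atoms[] = {
	"if", "then", "else", "end", "align"
};

/* Return true if the atom name found between "%(" and ")" needs a buffer. */
static bool atom_needs_buffer(const char *name, size_t len)
{
	size_t i;

	for (i = 0; i < sizeof(buffering_atoms) / sizeof(buffering_atoms[0]); i++) {
		const char *atom = buffering_atoms[i];
		size_t alen = strlen(atom);

		/* Match "if" and "if:equals=..." but not e.g. "iffy". */
		if (len >= alen && !strncmp(name, atom, alen) &&
		    (len == alen || name[alen] == ':'))
			return true;
	}
	return false;
}

/*
 * Scan the format string once, before any object is processed, and
 * report whether any atom in it requires the copying code path.
 */
static bool format_needs_buffer(const char *format)
{
	const char *p = format;

	while ((p = strstr(p, "%(")) != NULL) {
		const char *start = p + 2;
		const char *end = strchr(start, ')');

		if (!end)
			break; /* malformed atom; real code would report an error */
		if (atom_needs_buffer(start, (size_t)(end - start)))
			return true;
		p = end + 1;
	}
	return false;
}
```

With something like this, the result could be computed once per
invocation, and the per-object loop could write straight to the
output whenever it returns false.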

> 2. ref-filter uses more complex data structures and a more complex parsing
> process. This is what lets it provide more, and more useful, atoms. Therefore,
> I think the performance degradation that occurs here is expected.

Is there a way to avoid the more complex parsing when the atoms that
are actually used don't need it?

> 3. As Ævar Arnfjörð Bjarmason mentioned, oid_object_info_extended() was called
> twice in get_object() before. oid_object_info_extended() is on the hot path, so
> we should avoid calling it more than necessary. So in the last version of
> "[GSOC] cat-file: re-implement --textconv, --filters options", I unified the
> processing of --textconv and --filters to avoid calling
> oid_object_info_extended() twice.
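
If I read point 3 correctly, the pattern is "decide up front
everything the caller needs, then do one lookup". A toy sketch of
that pattern follows. It does not use the real object-store API:
expensive_lookup(), struct obj_request, the WANT_* flags and
get_object_once() are hypothetical stand-ins, and the values filled
in are dummies.

```c
#include <string.h>

/* Hypothetical flags describing which fields the caller wants. */
enum {
	WANT_TYPE    = 1 << 0,
	WANT_SIZE    = 1 << 1,
	WANT_CONTENT = 1 << 2
};

/* Hypothetical request/result struct; not a real git type. */
struct obj_request {
	unsigned want;          /* bitmask of WANT_* */
	int type;
	unsigned long size;
	const char *content;
};

/* Counts calls so the effect of batching the request is visible. */
static int lookup_calls;

/* Stand-in for the expensive hot-path lookup; fills in dummy values. */
static int expensive_lookup(const char *oid, struct obj_request *req)
{
	(void)oid;
	lookup_calls++;
	if (req->want & WANT_TYPE)
		req->type = 3;
	if (req->want & WANT_SIZE)
		req->size = 12;
	if (req->want & WANT_CONTENT)
		req->content = "hello world\n";
	return 0;
}

/*
 * Collect every requirement first, then perform a single lookup,
 * instead of one call for type/size and a second one for the content.
 */
static int get_object_once(const char *oid, int need_content,
			   struct obj_request *req)
{
	memset(req, 0, sizeof(*req));
	req->want = WANT_TYPE | WANT_SIZE;
	if (need_content)
		req->want |= WANT_CONTENT;
	return expensive_lookup(oid, req);
}
```

Each call to get_object_once() increments lookup_calls exactly once,
whether or not the content is requested.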

Ok, thanks for the details and your work on this performance issue!

I wonder if your patch series could be split, so that the early parts
that add new atoms to ref-filter could be merged sooner?

Best,
Christian.



