Re: [PATCH] [GSOC] ref-filter: get rid of show_ref_array_item

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



René Scharfe. <l.s.r@xxxxxx> 于2021年4月17日周六 下午5:11写道:
>
> Am 09.04.21 um 15:35 schrieb ZheNing Hu via GitGitGadget:
> > From: ZheNing Hu <adlternative@xxxxxxxxx>
> >
> > When we use `git for-each-ref`, every ref will call
> > `show_ref_array_item()` and allocate its own final strbuf.
> > But we can reuse the final strbuf for each step ref's output.
> > Since `show_ref_array_item()` is not used in many places,
> > we can directly delete `show_ref_array_item()` and use the
> > same logic code to replace it. In this way, the caller can
> > clearly see how the loop work.
>
> Inlining an exported function that is not providing the right level of
> abstraction is a bold move that simplifies the API and can unlock
> improvements at the former call sites, like the possibility to reuse an
> allocated buffer in this case.  OK.
>
> > The performance for `git for-each-ref` on the Git repository
> > itself with performance testing tool `hyperfine` changes from
> > 23.7 ms ± 0.9 ms to 22.2 ms ± 1.0 ms.
>
> I see a speedup as well, but it's within the noise.
>

Yes, the performance improvement is very small under a large number
of refs. It was almost completely drowned out by the noise.

> > At the same time, we apply this optimization to `git tag -l`
> > and `git branch -l`, the `git branch -l` performance upgrade
> > from 5.8 ms ± 0.8 ms to 2.7 ms ± 0.2 ms and `git tag -l`
> > performance upgrade from 5.9 ms ± 0.4 ms to 5.4 ms ± 0.4 ms.
>
> On my system there's no measurable change with these commands.
>

In our case, git branch -l has made obvious progress, but it may be because
the number of branches is far less than tags.

> Nevertheless I think reusing the buffer across the loops is a good
> idea.
>
> > Since the number of tags in git.git is much more than branches,
> > so this shows that the optimization will be more obvious in
> > those repositories that contain a small number of objects.
> >
> > This approach is similar to the one used by 79ed0a5
> > (cat-file: use a single strbuf for all output, 2018-08-14)
> > to speed up the cat-file builtin.
> >
> > Signed-off-by: ZheNing Hu <adlternative@xxxxxxxxx>
> > ---
> >     [GSOC] ref-filter: get rid of show_ref_array_item
> >
> >     Now git for-each-ref can reuse final buf for all refs output, the
> >     performance is slightly improved, This optimization is also applied to
> >     git tag -l and git branch -l.
> >
> >     Thanks.
> >
> > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-928%2Fadlternative%2Fref-filter-reuse-buf-v1
> > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-928/adlternative/ref-filter-reuse-buf-v1
> > Pull-Request: https://github.com/gitgitgadget/git/pull/928
> >
> >  builtin/branch.c       |  8 ++++----
> >  builtin/for-each-ref.c | 13 +++++++++++--
> >  builtin/tag.c          | 13 +++++++++++--
> >  ref-filter.c           | 24 +++++++++---------------
> >  ref-filter.h           |  2 --
> >  5 files changed, 35 insertions(+), 25 deletions(-)
> >
> > diff --git a/builtin/branch.c b/builtin/branch.c
> > index bcc00bcf182d..5c797e992aa4 100644
> > --- a/builtin/branch.c
> > +++ b/builtin/branch.c
> > @@ -411,6 +411,8 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
> >  {
> >       int i;
> >       struct ref_array array;
> > +     struct strbuf out = STRBUF_INIT;
> > +     struct strbuf err = STRBUF_INIT;
> >       int maxwidth = 0;
> >       const char *remote_prefix = "";
> >       char *to_free = NULL;
> > @@ -440,8 +442,7 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
> >       ref_array_sort(sorting, &array);
> >
> >       for (i = 0; i < array.nr; i++) {
> > -             struct strbuf out = STRBUF_INIT;
> > -             struct strbuf err = STRBUF_INIT;
> > +             strbuf_reset(&out);
> >               if (format_ref_array_item(array.items[i], format, &out, &err))
>
> This function didn't call show_ref_array_item() to begin with, so
> strictly speaking it's not related to change in the title.  It is a
> preexisting example of show_ref_array_item() not being flexible enough,
> though.  I think it makes sense to have separate patches for inlining
> the function verbatim and reusing the output buffer when
> format_ref_array_item() is called in a loop.
>

I agree with you. I will divide this into a separate patch.

> >                       die("%s", err.buf);
> >               if (column_active(colopts)) {
> > @@ -452,10 +453,9 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
> >                       fwrite(out.buf, 1, out.len, stdout);
> >                       putchar('\n');
> >               }
> > -             strbuf_release(&err);
> > -             strbuf_release(&out);
> >       }
> >
> > +     strbuf_release(&out);
>
> err is no longer released, and it is also not reset in the loop.
> That change is not mentioned in the commit message, but it should.
> Why is it safe?  Probably because format_ref_array_item() only
> populates it if it also returns non-zero and then we end up dying
> anyway.
>
> That makes leak checking harder, though -- it's not easy to see if
> err hasn't simply been forgotten to be released.  I'd just retain
> the strbuf_release() call at the end of the function -- it
> shouldn't have a measurable performance impact and documents that
> this function is cleaning up after itself.  Thoughts?
>

Makes sense. Perhaps future changes will forget the release of err buffer.
I will add `strbuf_release()` here.

Thanks.
--
ZheNing Hu




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux