René Scharfe. <l.s.r@xxxxxx> 于2021年4月17日周六 下午5:11写道: > > Am 09.04.21 um 15:35 schrieb ZheNing Hu via GitGitGadget: > > From: ZheNing Hu <adlternative@xxxxxxxxx> > > > > When we use `git for-each-ref`, every ref will call > > `show_ref_array_item()` and allocate its own final strbuf. > > But we can reuse the final strbuf for each step ref's output. > > Since `show_ref_array_item()` is not used in many places, > > we can directly delete `show_ref_array_item()` and use the > > same logic code to replace it. In this way, the caller can > > clearly see how the loop work. > > Inlining an exported function that is not providing the right level of > abstraction is a bold move that simplifies the API and can unlock > improvements at the former call sites, like the possibility to reuse an > allocated buffer in this case. OK. > > > The performance for `git for-each-ref` on the Git repository > > itself with performance testing tool `hyperfine` changes from > > 23.7 ms ± 0.9 ms to 22.2 ms ± 1.0 ms. > > I see a speedup as well, but it's within the noise. > Yes, the performance improvement is very small under a large number of refs. It was almost completely drowned out by the noise. > > At the same time, we apply this optimization to `git tag -l` > > and `git branch -l`, the `git branch -l` performance upgrade > > from 5.8 ms ± 0.8 ms to 2.7 ms ± 0.2 ms and `git tag -l` > > performance upgrade from 5.9 ms ± 0.4 ms to 5.4 ms ± 0.4 ms. > > On my system there's no measurable change with these commands. > In our case, git branch -l has made obvious progress, but it may be because the number of branches is far less than tags. > Nevertheless I think reusing the buffer across the loops is a good > idea. > > > Since the number of tags in git.git is much more than branches, > > so this shows that the optimization will be more obvious in > > those repositories that contain a small number of objects. > > > > This approach is similar to the one used by 79ed0a5 > > (cat-file: use a single strbuf for all output, 2018-08-14) > > to speed up the cat-file builtin. > > > > Signed-off-by: ZheNing Hu <adlternative@xxxxxxxxx> > > --- > > [GSOC] ref-filter: get rid of show_ref_array_item > > > > Now git for-each-ref can reuse final buf for all refs output, the > > performance is slightly improved, This optimization is also applied to > > git tag -l and git branch -l. > > > > Thanks. > > > > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-928%2Fadlternative%2Fref-filter-reuse-buf-v1 > > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-928/adlternative/ref-filter-reuse-buf-v1 > > Pull-Request: https://github.com/gitgitgadget/git/pull/928 > > > > builtin/branch.c | 8 ++++---- > > builtin/for-each-ref.c | 13 +++++++++++-- > > builtin/tag.c | 13 +++++++++++-- > > ref-filter.c | 24 +++++++++--------------- > > ref-filter.h | 2 -- > > 5 files changed, 35 insertions(+), 25 deletions(-) > > > > diff --git a/builtin/branch.c b/builtin/branch.c > > index bcc00bcf182d..5c797e992aa4 100644 > > --- a/builtin/branch.c > > +++ b/builtin/branch.c > > @@ -411,6 +411,8 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin > > { > > int i; > > struct ref_array array; > > + struct strbuf out = STRBUF_INIT; > > + struct strbuf err = STRBUF_INIT; > > int maxwidth = 0; > > const char *remote_prefix = ""; > > char *to_free = NULL; > > @@ -440,8 +442,7 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin > > ref_array_sort(sorting, &array); > > > > for (i = 0; i < array.nr; i++) { > > - struct strbuf out = STRBUF_INIT; > > - struct strbuf err = STRBUF_INIT; > > + strbuf_reset(&out); > > if (format_ref_array_item(array.items[i], format, &out, &err)) > > This function didn't call show_ref_array_item() to begin with, so > strictly speaking it's not related to change in the title. It is a > preexisting example of show_ref_array_item() not being flexible enough, > though. I think it makes sense to have separate patches for inlining > the function verbatim and reusing the output buffer when > format_ref_array_item() is called in a loop. > I agree with you. I will divide this into a separate patch. > > die("%s", err.buf); > > if (column_active(colopts)) { > > @@ -452,10 +453,9 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin > > fwrite(out.buf, 1, out.len, stdout); > > putchar('\n'); > > } > > - strbuf_release(&err); > > - strbuf_release(&out); > > } > > > > + strbuf_release(&out); > > err is no longer released, and it is also not reset in the loop. > That change is not mentioned in the commit message, but it should. > Why is it safe? Probably because format_ref_array_item() only > populates it if it also returns non-zero and then we end up dying > anyway. > > That makes leak checking harder, though -- it's not easy to see if > err hasn't simply been forgotten to be released. I'd just retain > the strbuf_release() call at the end of the function -- it > shouldn't have a measurable performance impact and documents that > this function is cleaning up after itself. Thoughts? > Makes sense. Perhaps future changes will forget the release of err buffer. I will add `strbuf_release()` here. Thanks. -- ZheNing Hu