René Scharfe <l.s.r@xxxxxx> 于2021年4月7日周三 上午2:34写道: > > Am 05.04.21 um 16:01 schrieb ZheNing Hu via GitGitGadget: > > From: ZheNing Hu <adlternative@xxxxxxxxx> > > > > When we use `git for-each-ref`, every ref will call > > `show_ref_array_item()` and allocate its own final strbuf > > and error strbuf. Instead, we can provide two single strbuf: > > final_buf and error_buf that get reused for each output. > > > > When run it 100 times: > > > > $ git for-each-ref > > > > on git.git : > > > > 3.19s user > > 3.88s system > > 35% cpu > > 20.199 total > > > > to: > > > > 2.89s user > > 4.00s system > > 34% cpu > > 19.741 total > > > > The performance has been slightly improved. > > I like to use hyperfine (https://github.com/sharkdp/hyperfine) to get > more stable benchmark numbers, incl. standard deviation. With three > warmup runs I get the following results for running git for-each-ref on > Git's own repo with the current master (2e36527f23): > Yes, hyperfine is really easy to use! > Benchmark #1: ./git for-each-ref > Time (mean ± σ): 18.8 ms ± 0.3 ms [User: 12.7 ms, System: 5.6 ms] > Range (min … max): 18.2 ms … 19.8 ms 148 runs > > With your patch on top I get this: > > Benchmark #1: ./git for-each-ref > Time (mean ± σ): 18.5 ms ± 0.4 ms [User: 12.3 ms, System: 5.6 ms] > Range (min … max): 17.8 ms … 19.6 ms 147 runs > > So there seems to be a slight improvement here, but it is within the > noise. > Yeah. I meet same noise when I do such test. > I'm quite surprised how much longer this takes on your machine, however, > and (like Peff already mentioned) how much of the total time it spends > in system calls. Is an antivirus program or similar interferring? Or > some kind of emulator or similar, e.g. Valgrind? Or has it been a long > time since you ran "git gc"? > Yes, I haven't used `git gc` for a long time. In addition, when I did the test before, I ran the network proxy software, so there have a bit notice. > The benchmark certainly depends on the number of local and remote > branches in the repo; my copy currently has 4304 according to > "git for-each-ref | wc -l". > Yes i understand this point. But In my git.git, the result of "git for-each-ref | wc -l" is 8716 refs. > > > > Signed-off-by: ZheNing Hu <adlternative@xxxxxxxxx> > > --- > > [GSOC] ref-filter: use single strbuf for all output > > > > This patch learned Jeff King's optimization measures in git > > cat-file(79ed0a5): using a single strbuf for all objects output Instead > > of allocating a large number of small strbuf for every object. > > > > So ref-filter can learn same thing: use single buffer: final_buf and > > error_buf for all refs output. > > > > Thanks. > > > > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-927%2Fadlternative%2Fref-filter-single-buf-v1 > > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-927/adlternative/ref-filter-single-buf-v1 > > Pull-Request: https://github.com/gitgitgadget/git/pull/927 > > > > builtin/for-each-ref.c | 4 +++- > > builtin/tag.c | 4 +++- > > ref-filter.c | 21 ++++++++++++--------- > > ref-filter.h | 5 ++++- > > 4 files changed, 22 insertions(+), 12 deletions(-) > > > > diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c > > index cb9c81a04606..9dc41f48bfa0 100644 > > --- a/builtin/for-each-ref.c > > +++ b/builtin/for-each-ref.c > > @@ -22,6 +22,8 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix) > > struct ref_array array; > > struct ref_filter filter; > > struct ref_format format = REF_FORMAT_INIT; > > + struct strbuf final_buf = STRBUF_INIT; > > + struct strbuf error_buf = STRBUF_INIT; > > > > struct option opts[] = { > > OPT_BIT('s', "shell", &format.quote_style, > > @@ -81,7 +83,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix) > > if (!maxcount || array.nr < maxcount) > > maxcount = array.nr; > > for (i = 0; i < maxcount; i++) > > - show_ref_array_item(array.items[i], &format); > > + show_ref_array_item(array.items[i], &format, &final_buf, &error_buf); > > This user of show_ref_array_item() calls it in a loop on an array. > > > ref_array_clear(&array); > > return 0; > > } > > diff --git a/builtin/tag.c b/builtin/tag.c > > index d403417b5625..8a38b3e2de34 100644 > > --- a/builtin/tag.c > > +++ b/builtin/tag.c > > @@ -39,6 +39,8 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting, > > struct ref_format *format) > > { > > struct ref_array array; > > + struct strbuf final_buf = STRBUF_INIT; > > + struct strbuf error_buf = STRBUF_INIT; > > char *to_free = NULL; > > int i; > > > > @@ -64,7 +66,7 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting, > > ref_array_sort(sorting, &array); > > > > for (i = 0; i < array.nr; i++) > > - show_ref_array_item(array.items[i], format); > > + show_ref_array_item(array.items[i], format, &final_buf, &error_buf); > > Dito. > > > ref_array_clear(&array); > > free(to_free); > > > > diff --git a/ref-filter.c b/ref-filter.c > > index f0bd32f71416..51ff6af64ebc 100644 > > --- a/ref-filter.c > > +++ b/ref-filter.c > > @@ -2436,16 +2436,16 @@ int format_ref_array_item(struct ref_array_item *info, > > } > > > > void show_ref_array_item(struct ref_array_item *info, > > - const struct ref_format *format) > > + const struct ref_format *format, > > + struct strbuf *final_buf, > > + struct strbuf *error_buf) > > { > > - struct strbuf final_buf = STRBUF_INIT; > > - struct strbuf error_buf = STRBUF_INIT; > > > > - if (format_ref_array_item(info, format, &final_buf, &error_buf)) > > - die("%s", error_buf.buf); > > - fwrite(final_buf.buf, 1, final_buf.len, stdout); > > - strbuf_release(&error_buf); > > - strbuf_release(&final_buf); > > + if (format_ref_array_item(info, format, final_buf, error_buf)) > > + die("%s", error_buf->buf); > > + fwrite(final_buf->buf, 1, final_buf->len, stdout); > > + strbuf_reset(error_buf); > > + strbuf_reset(final_buf); > > putchar('\n'); > > } > > > > @@ -2453,9 +2453,12 @@ void pretty_print_ref(const char *name, const struct object_id *oid, > > const struct ref_format *format) > > { > > struct ref_array_item *ref_item; > > + struct strbuf final_buf = STRBUF_INIT; > > + struct strbuf error_buf = STRBUF_INIT; > > + > > ref_item = new_ref_array_item(name, oid); > > ref_item->kind = ref_kind_from_refname(name); > > - show_ref_array_item(ref_item, format); > > + show_ref_array_item(ref_item, format, &final_buf, &error_buf); > > This third and final caller works with a single item; there is no loop. > > > free_array_item(ref_item); > > } > > > > diff --git a/ref-filter.h b/ref-filter.h > > index 19ea4c413409..95498c9f4467 100644 > > --- a/ref-filter.h > > +++ b/ref-filter.h > > @@ -120,7 +120,10 @@ int format_ref_array_item(struct ref_array_item *info, > > struct strbuf *final_buf, > > struct strbuf *error_buf); > > /* Print the ref using the given format and quote_style */ > > -void show_ref_array_item(struct ref_array_item *info, const struct ref_format *format); > > +void show_ref_array_item(struct ref_array_item *info, > > + const struct ref_format *format, > > + struct strbuf *final_buf, > > + struct strbuf *error_buf); > > This bring-your-own-buffer approach pushes responsibilities back to > the callers, in exchange for improved performance. The number of > users of this interface is low, so that's defensible. But that added > effort is also non-trivial -- as you demonstrated by leaking the > allocated memory. ;-) > Yes, this may be burden for the function caller. > How about offering to do more instead? In particular you could add > a count parameter and have show_ref_array_item() handle an array of > struct ref_array_item objects. It could reuse the buffers internally > to get the same performance benefit, and would free callers from > having to iterate loops themselves. Something like: > > void show_ref_array_items(struct ref_array_item **info, > size_t n, > const struct ref_format *format); > > Callers that deal with a single element can pass n = 1. > > Perhaps the "format" parameter should go first, like with printf. > > The double reference in "**info" is a bit ugly, though (array of > pointers instead of a simple array of objects). That's dictated > by struct ref_array_item containing a flexible array member, which > seems to be hard to change. > I personally think this idea is great. In this way, there is no need to pass in two strbuf from the outside. +void show_ref_array_items(struct ref_array_item **info, + const struct ref_format *format, + size_t n) +{ + struct strbuf final_buf = STRBUF_INIT; + struct strbuf error_buf = STRBUF_INIT; + size_t i; + + for (i = 0; i < n; i++) { + if (format_ref_array_item(info[i], format, &final_buf, &error_buf)) + die("%s", error_buf.buf); + fwrite(final_buf.buf, 1, final_buf.len, stdout); + strbuf_reset(&error_buf); + strbuf_reset(&final_buf); + putchar('\n'); + } + strbuf_release(&error_buf); + strbuf_release(&final_buf); +} + And the result is here(close the network proxy program): HEAD~ result : (git)-[heads/master] % hyperfine "./bin-wrappers/git for-each-ref" --warmup=10 Benchmark #1: ./bin-wrappers/git for-each-ref Time (mean ± σ): 18.7 ms ± 0.4 ms [User: 14.9 ms, System: 3.9 ms] Range (min … max): 18.1 ms … 19.8 ms 141 runs With the new patch : (git)-[ref-filter-single-buf] % hyperfine "./bin-wrappers/git for-each-ref" --warmup=10 Benchmark #1: ./bin-wrappers/git for-each-ref Time (mean ± σ): 18.2 ms ± 0.3 ms [User: 14.1 ms, System: 4.2 ms] Range (min … max): 17.4 ms … 19.2 ms 140 runs Seem like it does have some small advantages ;-) > > /* Parse a single sort specifier and add it to the list */ > > void parse_ref_sorting(struct ref_sorting **sorting_tail, const char *atom); > > /* Callback function for parsing the sort option */ > > > > base-commit: 2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48 > > > A new iteration will be sent later. Thanks! -- ZheNing Hu