Hi Taylor On 8 Mar 2022, at 17:00, Taylor Blau wrote: > On Tue, Mar 08, 2022 at 02:54:23AM +0000, John Cai via GitGitGadget wrote: >> /* >> * If "pack" is non-NULL, then "offset" is the byte offset within the pack from >> * which the object may be accessed (though note that we may also rely on >> @@ -363,6 +371,11 @@ static void batch_object_write(const char *obj_name, >> struct packed_git *pack, >> off_t offset) >> { >> + struct strbuf type_name = STRBUF_INIT; >> + >> + if (!opt->format) >> + data->info.type_name = &type_name; > > Hmmm, I'm not quite sure I understand why the extra string buffer is > necessary here. When we do the first expansion with the DEFAULT_FORMAT, > we set data->info.typep to point at data->type. > > So by the time we do our actual query (either with > `packed_object_info()` or just `oid_object_info_extended()`), we will > get the type filled into data->type. > > And we should be able to use that in print_default_format(), which saves > us in a couple of ways: > > - We don't have to (stack) allocate a string buffer, which then needs > to be cleaned up after printing each object. > > - (More importantly) we can avoid the extra string copy through the > intermediate buffer by using type_name() directly. Agree with your reasoning here. > > Here's a small patch on top that you could apply, if you're interested: > > --- >8 --- > > > diff --git a/builtin/cat-file.c b/builtin/cat-file.c > index ab9a49e13a..9f55afa75b 100644 > --- a/builtin/cat-file.c > +++ b/builtin/cat-file.c > @@ -355,5 +355,5 @@ static int print_default_format(char *buf, int len, struct expand_data *data) > { > return xsnprintf(buf, len, "%s %s %"PRIuMAX"\n", oid_to_hex(&data->oid), > - data->info.type_name->buf, > + type_name(data->type), > (uintmax_t)*data->info.sizep); > > @@ -372,9 +372,4 @@ static void batch_object_write(const char *obj_name, > off_t offset) > { > - struct strbuf type_name = STRBUF_INIT; > - > - if (!opt->format) > - data->info.type_name = &type_name; > - > if (!data->skip_object_info) { > int ret; > @@ -391,5 +386,5 @@ static void batch_object_write(const char *obj_name, > obj_name ? obj_name : oid_to_hex(&data->oid)); > fflush(stdout); > - goto cleanup; > + return; > } > } > @@ -410,7 +405,4 @@ static void batch_object_write(const char *obj_name, > batch_write(opt, "\n", 1); > } > - > -cleanup: > - strbuf_release(&type_name); > } > > --- 8< --- > > On my copy of git.git., it shaves off around ~7ms that we're spending > just copying type names back and forth. > > (without the above) > Benchmark 1: git.compile cat-file --batch-check --batch-all-objects > Time (mean ± σ): 578.7 ms ± 12.7 ms [User: 553.4 ms, System: 25.2 ms] > Range (min … max): 568.1 ms … 600.1 ms 10 runs > > (with the above) > Benchmark 1: git.compile cat-file --batch-check --batch-all-objects > Time (mean ± σ): 571.5 ms ± 7.9 ms [User: 544.0 ms, System: 27.3 ms] > Range (min … max): 558.8 ms … 583.2 ms 10 runs Thanks for this suggestion and the benchmark! This is similar to what Junio suggested, and I do think this is the right thing to do here. > > Thanks, > Taylor Thanks, John