Hi Taylor, On 8 Mar 2022, at 17:30, Taylor Blau wrote: > On Tue, Mar 08, 2022 at 10:08:46PM +0000, John Cai via GitGitGadget wrote: >> diff --git a/builtin/cat-file.c b/builtin/cat-file.c >> index 7b3f42950ec..e2edba70b41 100644 >> --- a/builtin/cat-file.c >> +++ b/builtin/cat-file.c >> @@ -351,6 +351,13 @@ static void print_object_or_die(struct batch_options *opt, struct expand_data *d >> } >> } >> >> +static int print_default_format(char *buf, int len, struct expand_data *data) >> +{ >> + return xsnprintf(buf, len, "%s %s %"PRIuMAX"\n", oid_to_hex(&data->oid), >> + type_name(data->type), >> + (uintmax_t)*data->info.sizep); >> +} > > Two small nits here. It looks like the indentation on the second and > third lines is off a little bit, since we'd typically expect those to be > indented to the same margin as the first argument to xsnprintf(). Thanks for bringing this up. I did have a question about indentation in this case. for the second line, I did try to indent it to align with buf. I attempted to do the same with the third line, but it's the ( that lines up with buf so optically it looks a little off. > > The other is that you're reading data->info.sizep by dereferencing it, > but we know that it points to data->size. So I think there it makes > sense to just read the value directly out of data->size, though note > that you'll still need the cast to uintmax_t since you're formatting it > with PRIuMAX. good point, I'll adjust this in the next version. > >> + >> /* >> * If "pack" is non-NULL, then "offset" is the byte offset within the pack from >> * which the object may be accessed (though note that we may also rely on >> @@ -381,10 +388,16 @@ static void batch_object_write(const char *obj_name, >> } >> } >> >> - strbuf_reset(scratch); >> - strbuf_expand(scratch, opt->format, expand_format, data); >> - strbuf_addch(scratch, '\n'); >> - batch_write(opt, scratch->buf, scratch->len); >> + if (!opt->format) { >> + char buf[1024]; >> + int len = print_default_format(buf, 1024, data); >> + batch_write(opt, buf, len); > > Just curious (and apologies if this was discussed earlier and I missed > it), but: is there a reason that we have to use a scratch buffer here > that is separate from the strbuf we already have allocated? > > That would avoid a large-ish stack variable, but it means that the two > paths are a little more similar, and can share the batch_write call > outside of the if/else statement. This was holdover code from before. Looks like the scratch buffer gets passed in. Do you mean we don't need to allocate char buf[1024] and instead we can just use scratch and pass it into print_default_format? > > The rest of the changes in this file all look good to me. > >> diff --git a/t/perf/p1006-cat-file.sh b/t/perf/p1006-cat-file.sh >> new file mode 100755 >> index 00000000000..e463623f5a3 >> --- /dev/null >> +++ b/t/perf/p1006-cat-file.sh >> @@ -0,0 +1,16 @@ >> +#!/bin/sh >> + >> +test_description='Basic sort performance tests' > > Is this description a hold-over from p0071? If so, it may be worth > updating here. > >> +test_expect_success 'setup' ' >> + git rev-list --all >rla >> +' >> + >> +test_perf 'cat-file --batch-check' ' >> + git cat-file --batch-check <rla >> +' > > We could probably get away with dropping the setup test and using > `--batch-all-objects` here. Note that right now you're only printing > commit objects, so there would be a slight behavior change from the way > the patch is currently written, but it should demonstrate the same > performance improvement. This sounds good to me! > > Thanks, > Taylor