Re: [PATCH v6 4/4] cat-file: add --batch-command mode

John Cai <johncai86@xxxxxxxxx> · Tue, 15 Feb 2022 17:58:41 -0500

Hi Eric,

Thanks for taking another look!

On 15 Feb 2022, at 14:39, Eric Sunshine wrote:

> On Mon, Feb 14, 2022 at 1:23 PM John Cai via GitGitGadget
> <gitgitgadget@xxxxxxxxx> wrote:
>> Add a new flag --batch-command that accepts commands and arguments
>> from stdin, similar to git-update-ref --stdin.
>
> Some relatively minor comments below. Not sure any of them are serious
> enough to warrant a reroll...
>
>> The contents command takes an <object> argument and prints out the object
>> contents.
>>
>> The info command takes a <object> argument and prints out the object
>> metadata.
>
> s/a <object>/an <object>/
>
>> Signed-off-by: John Cai <johncai86@xxxxxxxxx>
>> ---
>> diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
>> @@ -96,6 +96,33 @@ OPTIONS
>> +contents `<object>`::
>> +       Print object contents for object reference `<object>`. This corresponds to
>> +       the output of `--batch`.
>> +
>> +info `<object>`::
>> +       Print object info for object reference `<object>`. This corresponds to the
>> +       output of `--batch-check`.
>
> Sorry if I wasn't clear in my earlier review, but when I suggested
> s/<object>/`<object>`/, I was referring only to the body of each item,
> not to the item itself (for which we do not -- I think -- ever use
> `<...>`). So:
>
>     content <object>::
>         Print object contents ... `<object>`. ...
>
> As mentioned in my earlier review, I think the SYNOPSIS also needs an
> update to mention --batch-command.

:face-palm: yes I forgot about that in the last version.

>
>> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
>> @@ -513,6 +514,129 @@ static int batch_unordered_packed(const struct object_id *oid,
>> +static void dispatch_calls(struct batch_options *opt,
>> +               struct strbuf *output,
>> +               struct expand_data *data,
>> +               struct queued_cmd *cmd,
>> +               int nr)
>> +{
>> +       int i;
>> +
>> +       for (i = 0; i < nr; i++){
>
> Style: for (...) {
>
>> +               cmd[i].fn(opt, cmd[i].line, output, data);
>> +               free(cmd[i].line);
>> +       }
>> +
>> +       fflush(stdout);
>> +}
>
> If I recall correctly, Junio suggested calling free() within this
> loop, but this feels like an incorrect separation of concerns since it
> doesn't also reset the caller's `nr` to 0. If (for some reason),
> dispatch_calls() is invoked twice in a row without resetting `nr` to 0
> in between the calls, then the dispatched commands will be called with
> a pointer to freed memory.
>
> One somewhat ugly way to fix this potential problem would be for this
> function to clear `nr`:
>
>     static void dispatch_calls(..., int *nr)
>     {
>         for (...) {
>             cmd[i].fn(...);
>             free(cmd[i].line);
>         }
>         *nr = 0;
>         flush(stdout);
>     }
>
> But, as this is a private helper, the code as presented in the patch
> may be "good enough" even though it's a bit fragile.

What you suggested makes sense from a separation of concerns point of view. I'm
still learning what looks ugly in C :), but I think this is easier to read
overall than what I had before.

>
>> +static void batch_objects_command(struct batch_options *opt,
>> +                                   struct strbuf *output,
>> +                                   struct expand_data *data)
>> +{
>> +       while (!strbuf_getline(&input, stdin)) {
>> +               if (!input.len)
>> +                       die(_("empty command in input"));
>> +               if (isspace(*input.buf))
>> +                       die(_("whitespace before command: '%s'"), input.buf);
>> +
>> +               if (skip_prefix(input.buf, "flush", &cmd_end)) {
>> +                       if (!opt->buffer_output)
>> +                               die(_("flush is only for --buffer mode"));
>> +                       if (*cmd_end)
>> +                               die(_("flush takes no arguments"));
>
> I didn't articulate it in my previous review since the thought was
> only half-formed, but given "flushify", this will incorrectly complain
> that "flush takes no arguments" instead of complaining "unknown
> command: flushify" as is done below (or it will incorrectly complain
> "flush is only for --buffer mode" if --buffer wasn't specified).
>
> If I'm reading the code correctly, it seems as if these problems could
> be avoided by treating `flush` as just another parse_cmd::commands[]
> item so that it gets all the same parsing/checking as the other
> commands rather than parsing it separately here.

This is a good idea. I like the reduced complexity.

>
>> +                       dispatch_calls(opt, output, data, queued_cmd, nr);
>> +                       nr = 0;
>> +                       continue;
>> +               }
>> +
>> +               for (i = 0; i < ARRAY_SIZE(commands); i++) {
>> +                       if (!skip_prefix(input.buf, commands[i].prefix, &cmd_end))
>> +                               continue;
>> +
>> +                       cmd = &commands[i];
>> +                       if (cmd->takes_args) {
>> +                               if (*cmd_end != ' ')
>> +                                       die(_("%s requires arguments"),
>> +                                           commands[i].prefix);
>> +
>> +                               p = cmd_end + 1;
>> +                       } else if (*cmd_end) {
>> +                               die(_("%s takes no arguments"),
>> +                                   commands[i].prefix);
>> +                       }
>
> Good. Appears to be correctly handling the full matrix of
> command-requires-arguments and the actual input having or not having
> arguments.
>
>> +                       break;
>> +               }
>> +
>> +               if (!cmd)
>> +                       die(_("unknown command: '%s'"), input.buf);
>
> If you treat `flush` as just another parse_cmd::commands[], then right
> here is where you would handle it (I think):
>
>     if (strcmp(cmd->prefix, "flush")) {
>         dispatch_calls(opt, output, data, queued_cmd, nr);
>         nr = 0;
>         continue;
>     }
>
>> +               if (!opt->buffer_output) {
>> +                       cmd->fn(opt, p, output, data);
>> +                       continue;
>> +               }
>> +
>> +               ALLOC_GROW(queued_cmd, nr + 1, alloc);
>> +               call.fn = cmd->fn;
>> +               call.line = xstrdup_or_null(p);
>> +               queued_cmd[nr++] = call;
>> +       }
>> +
>> +       if (opt->buffer_output &&
>> +           nr &&
>> +           !git_env_bool("GIT_TEST_CAT_FILE_NO_FLUSH_ON_EXIT", 0))
>> +               dispatch_calls(opt, output, data, queued_cmd, nr);
>> +
>> +       free(queued_cmd);
>> +       strbuf_release(&input);
>> +}