Re: [PATCH v6 4/4] cat-file: add --batch-command mode

Eric Sunshine <sunshine@xxxxxxxxxxxxxx> · Tue, 15 Feb 2022 14:39:27 -0500

On Mon, Feb 14, 2022 at 1:23 PM John Cai via GitGitGadget
<gitgitgadget@xxxxxxxxx> wrote:
> Add a new flag --batch-command that accepts commands and arguments
> from stdin, similar to git-update-ref --stdin.

Some relatively minor comments below. Not sure any of them are serious
enough to warrant a reroll...

> The contents command takes an <object> argument and prints out the object
> contents.
>
> The info command takes a <object> argument and prints out the object
> metadata.

s/a <object>/an <object>/

> Signed-off-by: John Cai <johncai86@xxxxxxxxx>
> ---
> diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> @@ -96,6 +96,33 @@ OPTIONS
> +contents `<object>`::
> +       Print object contents for object reference `<object>`. This corresponds to
> +       the output of `--batch`.
> +
> +info `<object>`::
> +       Print object info for object reference `<object>`. This corresponds to the
> +       output of `--batch-check`.

Sorry if I wasn't clear in my earlier review, but when I suggested
s/<object>/`<object>`/, I was referring only to the body of each item,
not to the item itself (for which we do not -- I think -- ever use
`<...>`). So:

    content <object>::
        Print object contents ... `<object>`. ...

As mentioned in my earlier review, I think the SYNOPSIS also needs an
update to mention --batch-command.

> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> @@ -513,6 +514,129 @@ static int batch_unordered_packed(const struct object_id *oid,
> +static void dispatch_calls(struct batch_options *opt,
> +               struct strbuf *output,
> +               struct expand_data *data,
> +               struct queued_cmd *cmd,
> +               int nr)
> +{
> +       int i;
> +
> +       for (i = 0; i < nr; i++){

Style: for (...) {

> +               cmd[i].fn(opt, cmd[i].line, output, data);
> +               free(cmd[i].line);
> +       }
> +
> +       fflush(stdout);
> +}

If I recall correctly, Junio suggested calling free() within this
loop, but this feels like an incorrect separation of concerns since it
doesn't also reset the caller's `nr` to 0. If (for some reason),
dispatch_calls() is invoked twice in a row without resetting `nr` to 0
in between the calls, then the dispatched commands will be called with
a pointer to freed memory.

One somewhat ugly way to fix this potential problem would be for this
function to clear `nr`:

    static void dispatch_calls(..., int *nr)
    {
        for (...) {
            cmd[i].fn(...);
            free(cmd[i].line);
        }
        *nr = 0;
        flush(stdout);
    }

But, as this is a private helper, the code as presented in the patch
may be "good enough" even though it's a bit fragile.

> +static void batch_objects_command(struct batch_options *opt,
> +                                   struct strbuf *output,
> +                                   struct expand_data *data)
> +{
> +       while (!strbuf_getline(&input, stdin)) {
> +               if (!input.len)
> +                       die(_("empty command in input"));
> +               if (isspace(*input.buf))
> +                       die(_("whitespace before command: '%s'"), input.buf);
> +
> +               if (skip_prefix(input.buf, "flush", &cmd_end)) {
> +                       if (!opt->buffer_output)
> +                               die(_("flush is only for --buffer mode"));
> +                       if (*cmd_end)
> +                               die(_("flush takes no arguments"));

I didn't articulate it in my previous review since the thought was
only half-formed, but given "flushify", this will incorrectly complain
that "flush takes no arguments" instead of complaining "unknown
command: flushify" as is done below (or it will incorrectly complain
"flush is only for --buffer mode" if --buffer wasn't specified).

If I'm reading the code correctly, it seems as if these problems could
be avoided by treating `flush` as just another parse_cmd::commands[]
item so that it gets all the same parsing/checking as the other
commands rather than parsing it separately here.

> +                       dispatch_calls(opt, output, data, queued_cmd, nr);
> +                       nr = 0;
> +                       continue;
> +               }
> +
> +               for (i = 0; i < ARRAY_SIZE(commands); i++) {
> +                       if (!skip_prefix(input.buf, commands[i].prefix, &cmd_end))
> +                               continue;
> +
> +                       cmd = &commands[i];
> +                       if (cmd->takes_args) {
> +                               if (*cmd_end != ' ')
> +                                       die(_("%s requires arguments"),
> +                                           commands[i].prefix);
> +
> +                               p = cmd_end + 1;
> +                       } else if (*cmd_end) {
> +                               die(_("%s takes no arguments"),
> +                                   commands[i].prefix);
> +                       }

Good. Appears to be correctly handling the full matrix of
command-requires-arguments and the actual input having or not having
arguments.

> +                       break;
> +               }
> +
> +               if (!cmd)
> +                       die(_("unknown command: '%s'"), input.buf);

If you treat `flush` as just another parse_cmd::commands[], then right
here is where you would handle it (I think):

    if (strcmp(cmd->prefix, "flush")) {
        dispatch_calls(opt, output, data, queued_cmd, nr);
        nr = 0;
        continue;
    }

> +               if (!opt->buffer_output) {
> +                       cmd->fn(opt, p, output, data);
> +                       continue;
> +               }
> +
> +               ALLOC_GROW(queued_cmd, nr + 1, alloc);
> +               call.fn = cmd->fn;
> +               call.line = xstrdup_or_null(p);
> +               queued_cmd[nr++] = call;
> +       }
> +
> +       if (opt->buffer_output &&
> +           nr &&
> +           !git_env_bool("GIT_TEST_CAT_FILE_NO_FLUSH_ON_EXIT", 0))
> +               dispatch_calls(opt, output, data, queued_cmd, nr);
> +
> +       free(queued_cmd);
> +       strbuf_release(&input);
> +}