Re: [PATCH v4 8/8] cat-file: add --batch-command remote-object-info command

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Calvin Wan <calvinwan@xxxxxxxxxx> 于2022年5月3日周二 06:33写道:
>
> Since the `info` command in cat-file --batch-command prints object info
> for a given object, it is natural to add another command in cat-file
> --batch-command to print object info for a given object from a remote.
> Add `remote-object-info` to cat-file --batch-command.
>
> While `info` takes object ids one at a time, this creates overhead when
> making requests to a server so `remote-object-info` instead can take
> multiple object ids at once.
>
> cat-file --batch-command is generally implemented in the following
> manner:
>
>  - Receive and parse input from user
>  - Call respective function attached to command
>  - Set batch mode state, get object info, print object info
>
> In --buffer mode, this changes to:
>
>  - Receive and parse input from user
>  - Store respective function attached to command in a queue
>  - After flush, loop through commands in queue
>     - Call respective function attached to command
>     - Set batch mode state, get object info, print object info
>
> Notice how the getting and printing of object info is accomplished one
> at a time. As described above, this creates a problem for making
> requests to a server. Therefore, `remote-object-info` is implemented in
> the following manner:
>
>  - Receive and parse input from user
>  If command is `remote-object-info`:
>     - Get object info from remote
>     - Loop through object info
>         - Call respective function attached to `info`
>         - Set batch mode state, use passed in object info, print object
>           info
>  Else:
>     - Call respective function attached to command
>     - Parse input, get object info, print object info
>
> And finally for --buffer mode `remote-object-info`:
>  - Receive and parse input from user
>  - Store respective function attached to command in a queue
>  - After flush, loop through commands in queue:
>     If command is `remote-object-info`:
>         - Get object info from remote
>         - Loop through object info
>             - Call respective function attached to `info`
>             - Set batch mode state, use passed in object info, print
>               object info
>     Else:
>         - Call respective function attached to command
>         - Set batch mode state, get object info, print object info
>
> To summarize, `remote-object-info` gets object info from the remote and
> then generates multiples `info` commands with the object info passed in.
> It cannot be implemented similarly to `info` and `content` because of
> the overhead with remote commenication.
>
> ---
>  Documentation/git-cat-file.txt |  16 +-
>  builtin/cat-file.c             | 200 +++++++++++++++----
>  t/t1006-cat-file.sh            | 347 +++++++++++++++++++++++++++++++++
>  3 files changed, 522 insertions(+), 41 deletions(-)
>
> diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> index 24a811f0ef..0d9e8e6214 100644
> --- a/Documentation/git-cat-file.txt
> +++ b/Documentation/git-cat-file.txt
> @@ -115,6 +115,10 @@ info <object>::
>         Print object info for object reference `<object>`. This corresponds to the
>         output of `--batch-check`.
>
> +remote-object-info <remote> [<object>]::
> +       Print object info for object references `<object>` at specified <remote>.
> +       This command may only be combined with `--buffer`.
> +
>  flush::
>         Used with `--buffer` to execute all preceding commands that were issued
>         since the beginning or since the last flush was issued. When `--buffer`
> @@ -245,7 +249,8 @@ newline. The available atoms are:
>         The full hex representation of the object name.
>
>  `objecttype`::
> -       The type of the object (the same as `cat-file -t` reports).
> +       The type of the object (the same as `cat-file -t` reports). See
> +       `CAVEATS` below.
>
>  `objectsize`::
>         The size, in bytes, of the object (the same as `cat-file -s`
> @@ -253,13 +258,14 @@ newline. The available atoms are:
>
>  `objectsize:disk`::
>         The size, in bytes, that the object takes up on disk. See the
> -       note about on-disk sizes in the `CAVEATS` section below.
> +       note about on-disk sizes in the `CAVEATS` section below. Not
> +       supported by `remote-object-info`.
>
>  `deltabase`::
>         If the object is stored as a delta on-disk, this expands to the
>         full hex representation of the delta base object name.
>         Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
> -       below.
> +       below. Not supported by `remote-object-info`.
>
>  `rest`::
>         If this atom is used in the output string, input lines are split

Why not forbidden %(rest) here too?

> +static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
> +{
> +       int retval = 0;
> +       size_t i;
> +       struct remote *remote = NULL;
> +       struct object_id oid;
> +       struct string_list object_info_options = STRING_LIST_INIT_NODUP;
> +       static struct transport *gtransport;
> +       char *format = DEFAULT_FORMAT;
> +
> +       if (opt->format)
> +               format = xstrdup(opt->format);
> +
> +       remote = remote_get(argv[0]);
> +       if (!remote)
> +               die(_("must supply valid remote when using --object-info"));
> +       oid_array_clear(&object_info_oids);
> +       for (i = 1; i < argc; i++) {
> +               if (get_oid(argv[i], &oid))
> +                       die(_("malformed object id '%s'"), argv[i]);
> +               oid_array_append(&object_info_oids, &oid);
> +       }
> +
> +       gtransport = transport_get(remote, NULL);
> +       if (gtransport->smart_options) {
> +               size_t j;
> +               int include_size = 0, include_type = 0;
> +
> +               remote_object_info = xcalloc(object_info_oids.nr, sizeof(struct object_info));
> +               gtransport->smart_options->object_info = 1;
> +               gtransport->smart_options->object_info_oids = &object_info_oids;
> +               /**
> +                * 'size' is the only option currently supported.
> +                * Other options that are passed in the format will default to a
> +                * standard fetch request rather than object-info.
> +                */
> +               if (strstr(format, "%(objectsize)")) {
> +                       string_list_append(&object_info_options, "size");
> +                       include_size = 1;
> +               }
> +               if (strstr(format, "%(objecttype)")) {
> +                       string_list_append(&object_info_options, "type");
> +                       include_type = 1;
> +               }
> +               if (strstr(format, "%(objectsize:disk)"))
> +                       die(_("objectsize:disk is currently not supported with remote-object-info"));
> +               if (strstr(format, "%(deltabase)"))
> +                       die(_("deltabase is currently not supported with remote-object-info"));

%(rest) too?

> +               if (object_info_options.nr > 0) {
> +                       gtransport->smart_options->object_info_options = &object_info_options;
> +                       for (j = 0; j < object_info_oids.nr; j++) {
> +                               if (include_size)
> +                                       remote_object_info[j].sizep = xcalloc(1, sizeof(long));
> +                               if (include_type)
> +                                       remote_object_info[j].typep = xcalloc(1, sizeof(enum object_type));
> +                       }
> +                       gtransport->smart_options->object_info_data = &remote_object_info;
> +                       retval = transport_fetch_refs(gtransport, NULL);
> +               }
> +       } else {
> +               retval = -1;
> +       }
> +       return retval;
> +}

Maybe such code style will be better?

        if (!gtransport->smart_options) {
               return -1;
        }
        ...
        return transport_fetch_refs(gtransport, NULL);

> +
>  struct object_cb_data {
>         struct batch_options *opt;
>         struct expand_data *expand;
> @@ -538,6 +611,7 @@ typedef void (*parse_cmd_fn_t)(struct batch_options *, const char *,
>  struct queued_cmd {
>         parse_cmd_fn_t fn;
>         char *line;
> +       const char *name;
>  };
>
>  static void parse_cmd_contents(struct batch_options *opt,
> @@ -565,9 +639,49 @@ static const struct parse_cmd {
>  } commands[] = {
>         { "contents", parse_cmd_contents, 1},
>         { "info", parse_cmd_info, 1},
> +       { "remote-object-info", parse_cmd_info, 1},
>         { "flush", NULL, 0},
>  };
>
> +static void parse_remote_info(struct batch_options *opt,
> +                          char *line,
> +                          struct strbuf *output,
> +                          struct expand_data *data,
> +                          const struct parse_cmd *p_cmd,
> +                          struct queued_cmd *q_cmd)
> +{
> +       int count;
> +       size_t i;
> +       const char **argv;
> +
> +       count = split_cmdline(line, &argv);
> +       if (get_remote_info(opt, count, argv))
> +               goto cleanup;
> +       opt->use_remote_info = 1;
> +       data->skip_object_info = 1;
> +       data->mark_query = 0;
> +       for (i = 0; i < object_info_oids.nr; i++) {
> +               if (remote_object_info[i].sizep)
> +                       data->size = *remote_object_info[i].sizep;
> +               if (remote_object_info[i].typep)
> +                       data->type = *remote_object_info[i].typep;
> +
> +               data->oid = object_info_oids.oid[i];
> +               if (p_cmd)
> +                       p_cmd->fn(opt, argv[i+1], output, data);
> +               else
> +                       q_cmd->fn(opt, argv[i+1], output, data);
> +       }
> +       opt->use_remote_info = 0;
> +       data->skip_object_info = 0;
> +       data->mark_query = 1;
> +
> +cleanup:
> +       for (i = 0; i < object_info_oids.nr; i++)
> +               free_object_info_contents(&remote_object_info[i]);
> +       free(remote_object_info);
> +}
> +
>  static void dispatch_calls(struct batch_options *opt,
>                 struct strbuf *output,
>                 struct expand_data *data,
> @@ -579,8 +693,12 @@ static void dispatch_calls(struct batch_options *opt,
>         if (!opt->buffer_output)
>                 die(_("flush is only for --buffer mode"));
>
> -       for (i = 0; i < nr; i++)
> -               cmd[i].fn(opt, cmd[i].line, output, data);
> +       for (i = 0; i < nr; i++) {
> +               if (!strcmp(cmd[i].name, "remote-object-info"))
> +                       parse_remote_info(opt, cmd[i].line, output, data, NULL, &cmd[i]);
> +               else
> +                       cmd[i].fn(opt, cmd[i].line, output, data);
> +       }
>
>         fflush(stdout);
>  }
> @@ -640,11 +758,17 @@ static void batch_objects_command(struct batch_options *opt,
>                         dispatch_calls(opt, output, data, queued_cmd, nr);
>                         free_cmds(queued_cmd, &nr);
>                 } else if (!opt->buffer_output) {
> -                       cmd->fn(opt, p, output, data);
> +                       if (!strcmp(cmd->name, "remote-object-info")) {
> +                               char *line = xstrdup_or_null(p);
> +                               parse_remote_info(opt, line, output, data, cmd, NULL);

memory leak: "line".

> +                       } else {
> +                               cmd->fn(opt, p, output, data);
> +                       }
>                 } else {
>                         ALLOC_GROW(queued_cmd, nr + 1, alloc);
>                         call.fn = cmd->fn;
>                         call.line = xstrdup_or_null(p);
> +                       call.name = cmd->name;
>                         queued_cmd[nr++] = call;
>                 }
>         }
> +
> +test_expect_success 'remote-object-info fails on unspported filter option (objectsize:disk)' '
> +       (
> +               cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> +
> +               test_must_fail git cat-file --batch-command="%(objectsize:disk)" 2>err <<-EOF &&
> +               remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
> +               EOF
> +               test_i18ngrep "objectsize:disk is currently not supported with remote-object-info" err
> +       )
> +'
> +
> +test_expect_success 'remote-object-info fails on unspported filter option (deltabase)' '
> +       (
> +               cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> +
> +               test_must_fail git cat-file --batch-command="%(deltabase)" 2>err <<-EOF &&
> +               remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
> +               EOF
> +               test_i18ngrep "deltabase is currently not supported with remote-object-info" err
> +       )
> +'
> +

%(rest) too?

> +set_transport_variables "server"
> +
> +test_expect_success 'batch-command remote-object-info file://' '
> +       (
> +               cd server &&
> +
> +               echo "$hello_sha1 $hello_size" >expect &&
> +               echo "$tree_sha1 $tree_size" >>expect &&
> +               echo "$commit_sha1 $commit_size" >>expect &&
> +               echo "$tag_sha1 $tag_size" >>expect &&
> +               git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
> +               remote-object-info "file://$(pwd)" $hello_sha1
> +               remote-object-info "file://$(pwd)" $tree_sha1
> +               remote-object-info "file://$(pwd)" $commit_sha1
> +               remote-object-info "file://$(pwd)" $tag_sha1
> +               EOF
> +               test_cmp expect actual
> +       )

Can we support <rev> instead of only <oid> here?

$ git cat-file --batch-check
HEAD
28583b8d8ca72730d7c9e0ea50861ad431a6dea4 commit 3038
master
ab336e8f1c8009c8b1aab8deb592148e69217085 commit 281
origin/master
23b219f8e3f2adfb0441e135f0a880e6124f766c commit 282
origin/master:git.c
e5d62fa5a92e95e1ede041ebf913d841744c31f8 blob 28398

So cat-file --batch-check can support it.

$git cat-file --batch-commands
remote-object-info "file://$(pwd)" master:git.c

I guess it cannot work now, right?




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux