Calvin Wan <calvinwan@xxxxxxxxxx> 于2022年5月3日周二 06:33写道: > > Since the `info` command in cat-file --batch-command prints object info > for a given object, it is natural to add another command in cat-file > --batch-command to print object info for a given object from a remote. > Add `remote-object-info` to cat-file --batch-command. > > While `info` takes object ids one at a time, this creates overhead when > making requests to a server so `remote-object-info` instead can take > multiple object ids at once. > > cat-file --batch-command is generally implemented in the following > manner: > > - Receive and parse input from user > - Call respective function attached to command > - Set batch mode state, get object info, print object info > > In --buffer mode, this changes to: > > - Receive and parse input from user > - Store respective function attached to command in a queue > - After flush, loop through commands in queue > - Call respective function attached to command > - Set batch mode state, get object info, print object info > > Notice how the getting and printing of object info is accomplished one > at a time. As described above, this creates a problem for making > requests to a server. Therefore, `remote-object-info` is implemented in > the following manner: > > - Receive and parse input from user > If command is `remote-object-info`: > - Get object info from remote > - Loop through object info > - Call respective function attached to `info` > - Set batch mode state, use passed in object info, print object > info > Else: > - Call respective function attached to command > - Parse input, get object info, print object info > > And finally for --buffer mode `remote-object-info`: > - Receive and parse input from user > - Store respective function attached to command in a queue > - After flush, loop through commands in queue: > If command is `remote-object-info`: > - Get object info from remote > - Loop through object info > - Call respective function attached to `info` > - Set batch mode state, use passed in object info, print > object info > Else: > - Call respective function attached to command > - Set batch mode state, get object info, print object info > > To summarize, `remote-object-info` gets object info from the remote and > then generates multiples `info` commands with the object info passed in. > It cannot be implemented similarly to `info` and `content` because of > the overhead with remote commenication. > > --- > Documentation/git-cat-file.txt | 16 +- > builtin/cat-file.c | 200 +++++++++++++++---- > t/t1006-cat-file.sh | 347 +++++++++++++++++++++++++++++++++ > 3 files changed, 522 insertions(+), 41 deletions(-) > > diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt > index 24a811f0ef..0d9e8e6214 100644 > --- a/Documentation/git-cat-file.txt > +++ b/Documentation/git-cat-file.txt > @@ -115,6 +115,10 @@ info <object>:: > Print object info for object reference `<object>`. This corresponds to the > output of `--batch-check`. > > +remote-object-info <remote> [<object>]:: > + Print object info for object references `<object>` at specified <remote>. > + This command may only be combined with `--buffer`. > + > flush:: > Used with `--buffer` to execute all preceding commands that were issued > since the beginning or since the last flush was issued. When `--buffer` > @@ -245,7 +249,8 @@ newline. The available atoms are: > The full hex representation of the object name. > > `objecttype`:: > - The type of the object (the same as `cat-file -t` reports). > + The type of the object (the same as `cat-file -t` reports). See > + `CAVEATS` below. > > `objectsize`:: > The size, in bytes, of the object (the same as `cat-file -s` > @@ -253,13 +258,14 @@ newline. The available atoms are: > > `objectsize:disk`:: > The size, in bytes, that the object takes up on disk. See the > - note about on-disk sizes in the `CAVEATS` section below. > + note about on-disk sizes in the `CAVEATS` section below. Not > + supported by `remote-object-info`. > > `deltabase`:: > If the object is stored as a delta on-disk, this expands to the > full hex representation of the delta base object name. > Otherwise, expands to the null OID (all zeroes). See `CAVEATS` > - below. > + below. Not supported by `remote-object-info`. > > `rest`:: > If this atom is used in the output string, input lines are split Why not forbidden %(rest) here too? > +static int get_remote_info(struct batch_options *opt, int argc, const char **argv) > +{ > + int retval = 0; > + size_t i; > + struct remote *remote = NULL; > + struct object_id oid; > + struct string_list object_info_options = STRING_LIST_INIT_NODUP; > + static struct transport *gtransport; > + char *format = DEFAULT_FORMAT; > + > + if (opt->format) > + format = xstrdup(opt->format); > + > + remote = remote_get(argv[0]); > + if (!remote) > + die(_("must supply valid remote when using --object-info")); > + oid_array_clear(&object_info_oids); > + for (i = 1; i < argc; i++) { > + if (get_oid(argv[i], &oid)) > + die(_("malformed object id '%s'"), argv[i]); > + oid_array_append(&object_info_oids, &oid); > + } > + > + gtransport = transport_get(remote, NULL); > + if (gtransport->smart_options) { > + size_t j; > + int include_size = 0, include_type = 0; > + > + remote_object_info = xcalloc(object_info_oids.nr, sizeof(struct object_info)); > + gtransport->smart_options->object_info = 1; > + gtransport->smart_options->object_info_oids = &object_info_oids; > + /** > + * 'size' is the only option currently supported. > + * Other options that are passed in the format will default to a > + * standard fetch request rather than object-info. > + */ > + if (strstr(format, "%(objectsize)")) { > + string_list_append(&object_info_options, "size"); > + include_size = 1; > + } > + if (strstr(format, "%(objecttype)")) { > + string_list_append(&object_info_options, "type"); > + include_type = 1; > + } > + if (strstr(format, "%(objectsize:disk)")) > + die(_("objectsize:disk is currently not supported with remote-object-info")); > + if (strstr(format, "%(deltabase)")) > + die(_("deltabase is currently not supported with remote-object-info")); %(rest) too? > + if (object_info_options.nr > 0) { > + gtransport->smart_options->object_info_options = &object_info_options; > + for (j = 0; j < object_info_oids.nr; j++) { > + if (include_size) > + remote_object_info[j].sizep = xcalloc(1, sizeof(long)); > + if (include_type) > + remote_object_info[j].typep = xcalloc(1, sizeof(enum object_type)); > + } > + gtransport->smart_options->object_info_data = &remote_object_info; > + retval = transport_fetch_refs(gtransport, NULL); > + } > + } else { > + retval = -1; > + } > + return retval; > +} Maybe such code style will be better? if (!gtransport->smart_options) { return -1; } ... return transport_fetch_refs(gtransport, NULL); > + > struct object_cb_data { > struct batch_options *opt; > struct expand_data *expand; > @@ -538,6 +611,7 @@ typedef void (*parse_cmd_fn_t)(struct batch_options *, const char *, > struct queued_cmd { > parse_cmd_fn_t fn; > char *line; > + const char *name; > }; > > static void parse_cmd_contents(struct batch_options *opt, > @@ -565,9 +639,49 @@ static const struct parse_cmd { > } commands[] = { > { "contents", parse_cmd_contents, 1}, > { "info", parse_cmd_info, 1}, > + { "remote-object-info", parse_cmd_info, 1}, > { "flush", NULL, 0}, > }; > > +static void parse_remote_info(struct batch_options *opt, > + char *line, > + struct strbuf *output, > + struct expand_data *data, > + const struct parse_cmd *p_cmd, > + struct queued_cmd *q_cmd) > +{ > + int count; > + size_t i; > + const char **argv; > + > + count = split_cmdline(line, &argv); > + if (get_remote_info(opt, count, argv)) > + goto cleanup; > + opt->use_remote_info = 1; > + data->skip_object_info = 1; > + data->mark_query = 0; > + for (i = 0; i < object_info_oids.nr; i++) { > + if (remote_object_info[i].sizep) > + data->size = *remote_object_info[i].sizep; > + if (remote_object_info[i].typep) > + data->type = *remote_object_info[i].typep; > + > + data->oid = object_info_oids.oid[i]; > + if (p_cmd) > + p_cmd->fn(opt, argv[i+1], output, data); > + else > + q_cmd->fn(opt, argv[i+1], output, data); > + } > + opt->use_remote_info = 0; > + data->skip_object_info = 0; > + data->mark_query = 1; > + > +cleanup: > + for (i = 0; i < object_info_oids.nr; i++) > + free_object_info_contents(&remote_object_info[i]); > + free(remote_object_info); > +} > + > static void dispatch_calls(struct batch_options *opt, > struct strbuf *output, > struct expand_data *data, > @@ -579,8 +693,12 @@ static void dispatch_calls(struct batch_options *opt, > if (!opt->buffer_output) > die(_("flush is only for --buffer mode")); > > - for (i = 0; i < nr; i++) > - cmd[i].fn(opt, cmd[i].line, output, data); > + for (i = 0; i < nr; i++) { > + if (!strcmp(cmd[i].name, "remote-object-info")) > + parse_remote_info(opt, cmd[i].line, output, data, NULL, &cmd[i]); > + else > + cmd[i].fn(opt, cmd[i].line, output, data); > + } > > fflush(stdout); > } > @@ -640,11 +758,17 @@ static void batch_objects_command(struct batch_options *opt, > dispatch_calls(opt, output, data, queued_cmd, nr); > free_cmds(queued_cmd, &nr); > } else if (!opt->buffer_output) { > - cmd->fn(opt, p, output, data); > + if (!strcmp(cmd->name, "remote-object-info")) { > + char *line = xstrdup_or_null(p); > + parse_remote_info(opt, line, output, data, cmd, NULL); memory leak: "line". > + } else { > + cmd->fn(opt, p, output, data); > + } > } else { > ALLOC_GROW(queued_cmd, nr + 1, alloc); > call.fn = cmd->fn; > call.line = xstrdup_or_null(p); > + call.name = cmd->name; > queued_cmd[nr++] = call; > } > } > + > +test_expect_success 'remote-object-info fails on unspported filter option (objectsize:disk)' ' > + ( > + cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" && > + > + test_must_fail git cat-file --batch-command="%(objectsize:disk)" 2>err <<-EOF && > + remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1 > + EOF > + test_i18ngrep "objectsize:disk is currently not supported with remote-object-info" err > + ) > +' > + > +test_expect_success 'remote-object-info fails on unspported filter option (deltabase)' ' > + ( > + cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" && > + > + test_must_fail git cat-file --batch-command="%(deltabase)" 2>err <<-EOF && > + remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1 > + EOF > + test_i18ngrep "deltabase is currently not supported with remote-object-info" err > + ) > +' > + %(rest) too? > +set_transport_variables "server" > + > +test_expect_success 'batch-command remote-object-info file://' ' > + ( > + cd server && > + > + echo "$hello_sha1 $hello_size" >expect && > + echo "$tree_sha1 $tree_size" >>expect && > + echo "$commit_sha1 $commit_size" >>expect && > + echo "$tag_sha1 $tag_size" >>expect && > + git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF && > + remote-object-info "file://$(pwd)" $hello_sha1 > + remote-object-info "file://$(pwd)" $tree_sha1 > + remote-object-info "file://$(pwd)" $commit_sha1 > + remote-object-info "file://$(pwd)" $tag_sha1 > + EOF > + test_cmp expect actual > + ) Can we support <rev> instead of only <oid> here? $ git cat-file --batch-check HEAD 28583b8d8ca72730d7c9e0ea50861ad431a6dea4 commit 3038 master ab336e8f1c8009c8b1aab8deb592148e69217085 commit 281 origin/master 23b219f8e3f2adfb0441e135f0a880e6124f766c commit 282 origin/master:git.c e5d62fa5a92e95e1ede041ebf913d841744c31f8 blob 28398 So cat-file --batch-check can support it. $git cat-file --batch-commands remote-object-info "file://$(pwd)" master:git.c I guess it cannot work now, right?