This is a continuation of Calvin Wan's (calvinwan@xxxxxxxxxx) patch series [PATCH v5 0/6] cat-file: add --batch-command remote-object-info command at [1]. Sometimes it is useful to get information about an object without having to download it completely. The server logic for retrieving size has already been implemented and merged in "a2ba162cda (object-info: support for retrieving object info, 2021-04-20)"[2]. This patch series implement the client option for it. This patch series add the `remote-object-info` command to `cat-file --batch-command`. This command allows the client to make an object-info command request to a server that supports protocol v2. If the server is v2, but does not have object-info capability, the entire object is fetched and the relevant object info is returned. A few questions open for discussions please: 1. In the current implementation, if a user puts `remote-object-info` in protocol v1, `cat-file --batch-command` will die. Which way do we prefer? "error and exit (i.e. die)" or "warn and wait for new command". 2. Right now, only the size is supported. If the batch command format contains objectsize:disk or deltabase, it will die. The question is about objecttype. In the current implementation, it will die too. But dying on objecttype breaks the default format. We have changed the default format to %(objectname) %(objectsize) when remote-object-info is used. Any suggestions on this approach? [1] https://lore.kernel.org/git/20220728230210.2952731-1-calvinwan@xxxxxxxxxx/#t [2] https://git.kernel.org/pub/scm/git/git.git/commit/?id=a2ba162cda2acc171c3e36acbbc854792b093cb7 V1 of the patch series can be found here: https://lore.kernel.org/git/20240628190503.67389-1-eric.peijian@xxxxxxxxx/ v2 of the patch series can be found here: https://lore.kernel.org/git/20240720034337.57125-1-eric.peijian@xxxxxxxxx/ Changes since V3 ================ - Fix typos and formatting errors - Add warning in the git-cat-file doc about the default format - Add a new test lib file, lib-cat-file.sh. And put the shared code of t1017-cat-file-remote-object-info.sh and t1006-cat-file.sh in it. Thank you. Eric Ju Calvin Wan (5): fetch-pack: refactor packet writing fetch-pack: move fetch initialization serve: advertise object-info feature transport: add client support for object-info cat-file: add remote-object-info to batch-command Eric Ju (1): cat-file: add declaration of variable i inside its for loop Documentation/git-cat-file.txt | 24 +- builtin/cat-file.c | 119 +++- fetch-pack.c | 49 +- fetch-pack.h | 10 + object-file.c | 11 + object-store-ll.h | 3 + serve.c | 4 +- t/lib-cat-file.sh | 16 + t/t1006-cat-file.sh | 13 +- t/t1017-cat-file-remote-object-info.sh | 739 +++++++++++++++++++++++++ transport-helper.c | 11 +- transport.c | 115 +++- transport.h | 11 + 13 files changed, 1081 insertions(+), 44 deletions(-) create mode 100644 t/lib-cat-file.sh create mode 100755 t/t1017-cat-file-remote-object-info.sh Range-diff against v3: 1: b570dee186 = 1: 41898fe23e fetch-pack: refactor packet writing 2: e8777e8776 = 2: b3a1bee551 fetch-pack: move fetch initialization 3: d00d19cf2c = 3: d363b0f768 serve: advertise object-info feature 4: 3e1773910c ! 4: 3118061b21 transport: add client support for object-info @@ transport-helper.c: static int fetch_refs(struct transport *transport, */ if (data->transport_options.acked_commits) { warning(_("--negotiate-only requires protocol v2")); - return -1; +@@ transport-helper.c: static int fetch_refs(struct transport *transport, + free_refs(dummy); } + /* fail the command explicitly to avoid further commands input. */ + if (transport->smart_options->object_info) + die(_("remote-object-info requires protocol v2")); + - if (!data->get_refs_list_called) - get_refs_list_using_list(transport, 0); - ++ if (!data->get_refs_list_called) ++ get_refs_list_using_list(transport, 0); ++ + count = 0; + for (i = 0; i < nr_heads; i++) + if (!(to_fetch[i]->status & REF_STATUS_UPTODATE)) ## transport.c ## @@ transport.c: static struct ref *handshake(struct transport *transport, int for_push, @@ transport.c: static struct ref *handshake(struct transport *transport, int for_p @@ transport.c: static int fetch_refs_via_pack(struct transport *transport, struct ref *refs = NULL; struct fetch_pack_args args; - struct ref *refs_tmp = NULL; + struct ref *refs_tmp = NULL, **to_fetch_dup = NULL; + struct ref *object_info_refs = NULL; memset(&args, 0, sizeof(args)); @@ transport.c: static int fetch_refs_via_pack(struct transport *transport, + ref_itr->exact_oid = 1; + if (i == transport->smart_options->object_info_oids->nr - 1) + /* last element, no need to allocate to next */ -+ ref_itr -> next = NULL; ++ ref_itr->next = NULL; + else + ref_itr->next = alloc_ref(""); @@ transport.c: static int fetch_refs_via_pack(struct transport *transport, close(data->fd[0]); if (data->fd[1] >= 0) close(data->fd[1]); - if (finish_connect(data->conn)) - ret = -1; - data->conn = NULL; -- - free_refs(refs_tmp); - free_refs(refs); - list_objects_filter_release(&args.filter_options); ## transport.h ## @@ 5: bb110fbc93 = 5: 2ae81acf2a cat-file: add declaration of variable i inside its for loop 6: 6dd143c164 ! 6: b5aa6c1888 cat-file: add remote-object-info to batch-command @@ Commit message - Get object info, print object info To summarize, `remote-object-info` gets object info from the remote and - then loop through the object info passed in, print the info. + then loop through the object info passed in, printing the info. In order for remote-object-info to avoid remote communication overhead in the non-buffer mode, the objects are passed in as such: @@ Documentation/git-cat-file.txt: newline. The available atoms are: -%(objecttype) %(objectsize)`. +%(objecttype) %(objectsize)`, except for `remote-object-info` commands which use +`%(objectname) %(objectsize)` for now because "%(objecttype)" is not supported yet. -+When "%(objecttype)" is supported, default format should be unified. ++WARNING: When "%(objecttype)" is supported, the default format WILL be unified, so ++DO NOT RELY on the current the default format to stay the same!!! If `--batch` is specified, or if `--batch-command` is used with the `contents` command, the object information is followed by the object contents (consisting @@ Documentation/git-cat-file.txt: scripting purposes. CAVEATS ------- -+Note that since %(objecttype), %(objectsize:disk) and %(deltabase) are currently not supported by the -+`remote-object-info` command, we will error and exit when they are in the format string. ++Note that since %(objecttype), %(objectsize:disk) and %(deltabase) are ++currently not supported by the `remote-object-info` command, we will error ++and exit when they are in the format string. + Note that the sizes of objects on disk are reported accurately, but care should be taken in drawing conclusions about which refs or objects are @@ object-store-ll.h: int for_each_object_in_pack(struct packed_git *p, + #endif /* OBJECT_STORE_LL_H */ - ## t/t1017-cat-file-remote-object-info.sh (new) ## + ## t/lib-cat-file.sh (new) ## @@ -+#!/bin/sh -+ -+test_description='git cat-file --batch-command with remote-object-info command' -+ -+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main -+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME -+ -+. ./test-lib.sh ++# Library of git-cat-file related functions. + ++# Print a string without a trailing newline +echo_without_newline () { -+ printf '%s' "$*" ++ printf '%s' "$*" +} + ++# Print a string without newlines and replaces them with a NULL character (\0). +echo_without_newline_nul () { -+ echo_without_newline "$@" | tr '\n' '\0' ++ echo_without_newline "$@" | tr '\n' '\0' +} + ++# Calculate the length of a string removing any leading spaces. +strlen () { -+ echo_without_newline "$1" | wc -c | sed -e 's/^ *//' ++ echo_without_newline "$1" | wc -c | sed -e 's/^ *//' +} + + ## t/t1006-cat-file.sh ## +@@ t/t1006-cat-file.sh: test_description='git cat-file' + + TEST_PASSES_SANITIZE_LEAK=true + . ./test-lib.sh ++. "$TEST_DIRECTORY"/lib-cat-file.sh + + test_cmdmode_usage () { + test_expect_code 129 "$@" 2>err && +@@ t/t1006-cat-file.sh: do + ' + done + +-echo_without_newline () { +- printf '%s' "$*" +-} +- +-echo_without_newline_nul () { +- echo_without_newline "$@" | tr '\n' '\0' +-} +- +-strlen () { +- echo_without_newline "$1" | wc -c | sed -e 's/^ *//' +-} +- + run_tests () { + type=$1 + oid=$2 + + ## t/t1017-cat-file-remote-object-info.sh (new) ## +@@ ++#!/bin/sh ++ ++test_description='git cat-file --batch-command with remote-object-info command' ++ ++GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main ++export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME ++ ++. ./test-lib.sh ++. "$TEST_DIRECTORY"/lib-cat-file.sh + +hello_content="Hello World" +hello_size=$(strlen "$hello_content") -- 2.47.0