On Mon, Dec 23, 2024 at 3:26 PM Eric Ju <eric.peijian@xxxxxxxxx> wrote:
>
> Since the `info` command in cat-file --batch-command prints object info
> for a given object, it is natural to add another command in cat-file
> --batch-command to print object info for a given object from a remote.
>
> Add `remote-object-info` to cat-file --batch-command.
>
> While `info` takes object ids one at a time, this creates
> overhead when making requests to a server. So `remote-object-info`
> instead can take multiple object ids at once.
>
> cat-file --batch-command is generally implemented in the following
> manner:
>
>  - Receive and parse input from user
>  - Call respective function attached to command
>  - Get object info, print object info
>
> In --buffer mode, this changes to:
>
>  - Receive and parse input from user
>  - Store respective function attached to command in a queue
>  - After flush, loop through commands in queue
>     - Call respective function attached to command
>     - Get object info, print object info
>
> Notice how the getting and printing of object info is accomplished one
> at a time. As described above, this creates a problem for making
> requests to a server.
> Therefore, `remote-object-info` is implemented in
> the following manner:
>
>  - Receive and parse input from user
>  If command is `remote-object-info`:
>     - Get object info from remote
>     - Loop through and print each object info
>  Else:
>     - Call respective function attached to command
>     - Parse input, get object info, print object info
>
> And finally for --buffer mode `remote-object-info`:
>  - Receive and parse input from user
>  - Store respective function attached to command in a queue
>  - After flush, loop through commands in queue:
>     If command is `remote-object-info`:
>         - Get object info from remote
>         - Loop through and print each object info
>     Else:
>         - Call respective function attached to command
>         - Get object info, print object info
>
> To summarize, `remote-object-info` gets object info from the remote and
> then loops through the object info passed in, printing the info.
>
> In order for remote-object-info to avoid remote communication overhead
> in the non-buffer mode, the objects are passed in as such:
>
>     remote-object-info <remote> <oid> <oid> ... <oid>
>
> rather than
>
>     remote-object-info <remote> <oid>
>     remote-object-info <remote> <oid>
>     ...
>     remote-object-info <remote> <oid>
>
> Helped-by: Jonathan Tan <jonathantanmy@xxxxxxxxxx>
> Helped-by: Christian Couder <chriscool@xxxxxxxxxxxxx>
> Signed-off-by: Calvin Wan <calvinwan@xxxxxxxxxx>
> Signed-off-by: Eric Ju <eric.peijian@xxxxxxxxx>
> ---
>  Documentation/git-cat-file.txt         |  24 +-
>  builtin/cat-file.c                     |  99 ++++
>  object-file.c                          |  11 +
>  object-store-ll.h                      |   3 +
>  t/lib-cat-file.sh                      |  16 +
>  t/t1006-cat-file.sh                    |  13 +-
>  t/t1017-cat-file-remote-object-info.sh | 652 +++++++++++++++++++++++++
>  7 files changed, 802 insertions(+), 16 deletions(-)
>  create mode 100644 t/lib-cat-file.sh
>  create mode 100755 t/t1017-cat-file-remote-object-info.sh
>
> diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> index d5890ae368..6a2f9fd752 100644
> --- a/Documentation/git-cat-file.txt
> +++ b/Documentation/git-cat-file.txt
> @@ -149,6 +149,13 @@ info <object>::
>         Print object info for object reference `<object>`. This corresponds to the
>         output of `--batch-check`.
>
> +remote-object-info <remote> <object>...::
> +       Print object info for object references `<object>` at specified
> +       `<remote>` without downloading objects from the remote.
> +       Error when the `object-info` capability is not supported by the server.
> +       Error when no object references are provided.
> +       This command may be combined with `--buffer`.
> +
>  flush::
>         Used with `--buffer` to execute all preceding commands that were issued
>         since the beginning or since the last flush was issued. When `--buffer`
> @@ -290,7 +297,8 @@ newline. The available atoms are:
>         The full hex representation of the object name.
>
>  `objecttype`::
> -       The type of the object (the same as `cat-file -t` reports).
> +       The type of the object (the same as `cat-file -t` reports). See
> +       `CAVEATS` below. Not supported by `remote-object-info`.
>
>  `objectsize`::
>         The size, in bytes, of the object (the same as `cat-file -s`
> @@ -298,13 +306,14 @@ newline. The available atoms are:
>
>  `objectsize:disk`::
>         The size, in bytes, that the object takes up on disk. See the
> -       note about on-disk sizes in the `CAVEATS` section below.
> +       note about on-disk sizes in the `CAVEATS` section below. Not
> +       supported by `remote-object-info`.
>
>  `deltabase`::
>         If the object is stored as a delta on-disk, this expands to the
>         full hex representation of the delta base object name.
>         Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
> -       below.
> +       below. Not supported by `remote-object-info`.
>
>  `rest`::
>         If this atom is used in the output string, input lines are split
> @@ -314,7 +323,10 @@ newline. The available atoms are:
>         line) are output in place of the `%(rest)` atom.
>
>  If no format is specified, the default format is `%(objectname)
> -%(objecttype) %(objectsize)`.
> +%(objecttype) %(objectsize)`, except for `remote-object-info` commands which use
> +`%(objectname) %(objectsize)` for now because "%(objecttype)" is not supported yet.
> +WARNING: When "%(objecttype)" is supported, the default format WILL be unified, so
> +DO NOT RELY on the current the default format to stay the same!!!

I remember this was one of my initial concerns when I first worked on
this series. Without a use case for the other fields, it is hard to say
what a default format that includes them should look like, and once they
are implemented, the current default of `%(objectname) %(objectsize)`
would have to change as well.

I'm glad to see this outcome is well documented, so the feature can work
now while leaving a backdoor to change the default later if necessary.

Thanks again for your work on this series.
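
For anyone skimming the thread, here is a rough sketch of the two points
above: one `remote-object-info` line carries a remote plus several object
ids, and the default format differs from the local one for now. This is
not the code from the patch. The struct, the helper names and the size
limits are all hypothetical and chosen only for illustration; only the
two format strings come from the quoted documentation hunk.

```c
#include <assert.h>
#include <string.h>

#define MAX_OIDS 64 /* hypothetical cap, for the sketch only */

/* Hypothetical container for one parsed command line; not a Git struct. */
struct remote_info_request {
	char remote[128];
	char oids[MAX_OIDS][65]; /* hex object ids, up to SHA-256 length */
	int nr;
};

/*
 * Copy the next space-delimited token from p into out.
 * Returns a pointer just past the token, or NULL if there is no
 * further token or the token does not fit in out.
 */
static const char *next_token(const char *p, char *out, size_t outsz)
{
	size_t len;

	p += strspn(p, " ");   /* skip leading spaces */
	len = strcspn(p, " "); /* length of the token */
	if (len == 0 || len >= outsz)
		return NULL;
	memcpy(out, p, len);
	out[len] = '\0';
	return p + len;
}

/*
 * Parse "<remote> <oid> <oid> ..." (the text after the command name).
 * Returns 0 on success and -1 when the remote or every object id is
 * missing, mirroring the documented "Error when no object references
 * are provided". Parsing stops at the first oversized token or once
 * MAX_OIDS ids have been collected.
 */
static int parse_remote_object_info(const char *args,
				    struct remote_info_request *req)
{
	const char *p;

	assert(args && req);
	req->nr = 0;
	p = next_token(args, req->remote, sizeof(req->remote));
	if (!p)
		return -1;

	while (req->nr < MAX_OIDS) {
		const char *q = next_token(p, req->oids[req->nr],
					   sizeof(req->oids[0]));
		if (!q)
			break;
		req->nr++;
		p = q;
	}
	return req->nr > 0 ? 0 : -1;
}

/* Default formats as stated in the quoted documentation hunk. */
static const char *default_format(int is_remote)
{
	return is_remote ? "%(objectname) %(objectsize)"
			 : "%(objectname) %(objecttype) %(objectsize)";
}
```

With all ids collected from one line, a caller could issue a single
request to the remote and then loop over the results to print them,
which is the batching the commit message argues for.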