Hi Jonathan and John
On 07/02/2022 23:34, Jonathan Tan wrote:
"John Cai via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
However, if we had --batch-command, we wouldn't need to keep both
processes around, and instead just have one --batch-command process
where we can flip between getting object info, and getting object
contents. Since we have a pair of cat-file processes per repository,
this means we can get rid of roughly half of the long-lived git cat-file
processes. Given there are many repositories being accessed at any given
time, this can lead to huge savings on a given server.
One other benefit is that, in a partial clone, explicit flushes make it
possible to prefetch objects in batches.
Jonathan, is there any overlap between what this series is trying to do
and your proposal for a batch command[1]? For example, would extending
this series to get blob sizes be useful to you?
Best Wishes
Phillip
[1] https://lore.kernel.org/git/20220207190320.2960362-1-jonathantanmy@xxxxxxxxxx/
diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index bef76f4dd06..618dbd15338 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -96,6 +96,25 @@ OPTIONS
need to specify the path, separated by whitespace. See the
section `BATCH OUTPUT` below for details.
+--batch-command::
+ Enter a command mode that reads commands and arguments from stdin.
+ May not be combined with any other options or arguments except
+ `--textconv` or `--filters`, in which case the input lines also need to
+ specify the path, separated by whitespace. See the section
+ `BATCH OUTPUT` below for details.
+
+contents <object>::
+ Print object contents for object reference <object>
+
+info <object>::
+ Print object info for object reference <object>
+
+flush::
+ Execute all preceding commands that were issued since the beginning or
+ since the last flush command was issued. Only used with --buffer. When
+ --buffer is not used, commands are flushed each time without issuing
+ `flush`.
The way this is formatted leads me to think that "contents", etc. are
CLI arguments, not things written to stdin. Some of the commit message
probably needs to go here.
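To illustrate the point that these are lines written to stdin rather than
CLI arguments, here is a minimal Python sketch of what a caller might send.
It assumes a git build that includes this patch and that the commands are
newline-terminated as the proposed documentation suggests; the helper names
build_batch_input and run_batch are hypothetical and not part of the series.

```python
import subprocess

# Commands taken from the proposed documentation for --batch-command.
VALID_VERBS = ("contents", "info")


def build_batch_input(requests, buffered=True):
    """Return the text to write to the stdin of a --batch-command process.

    `requests` is a list of (verb, object) pairs, e.g. ("info", "HEAD").
    """
    lines = []
    for verb, obj in requests:
        if verb not in VALID_VERBS:
            raise ValueError(f"unknown command: {verb!r}")
        lines.append(f"{verb} {obj}")
    if buffered:
        # Per the proposed docs, with --buffer the queued commands are
        # executed when an explicit `flush` is issued.
        lines.append("flush")
    return "".join(line + "\n" for line in lines)


def run_batch(repo_path, requests):
    """Hypothetical helper: feed all commands to a single cat-file process.

    Assumes the installed git understands --batch-command (i.e. it
    includes this patch); otherwise the call fails.
    """
    result = subprocess.run(
        ["git", "-C", repo_path, "cat-file", "--batch-command", "--buffer"],
        input=build_batch_input(requests).encode(),
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout
```

With something like this, one process can answer both an "info" and a
"contents" request for the same object, which is the saving the commit
message describes.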
I just looked at the commit message and documentation for now.
If you have time and are interested, we at Google are thinking of a more
comprehensive "batch" process [1].
[1] https://lore.kernel.org/git/20220207190320.2960362-1-jonathantanmy@xxxxxxxxxx/