On Fri, Dec 06, 2024 at 08:07:16PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <derrickstolee@xxxxxxxxxx>
>
> Users may want to specify a minimum batch size for their needs. This is only
> a minimum: the path-walk API provides a list of OIDs that correspond to the
> same path, and thus it is optimal to allow delta compression across those
> objects in a single server request.

Okay, here you explicitly say that this is a minimum batch size, so this
is by design and with a proper reason. Good.

> We could consider limiting the request to have a maximum batch size in the
> future. For now, we let the path-walk API batches determine the
> boundaries.

Should we maybe rename `--batch-size` to `--min-batch-size` so that it
does not become awkward if we ever want to have a maximum batch size, as
well? Also helps to set expectations with the user.

[snip]

> Based on these experiments, a batch size of 50,000 was chosen as the
> default value.

Thanks for all the data, this is really helpful!

> diff --git a/Documentation/git-backfill.txt b/Documentation/git-backfill.txt
> index 0e10f066fef..9b0bae04e9d 100644
> --- a/Documentation/git-backfill.txt
> +++ b/Documentation/git-backfill.txt
> @@ -38,6 +38,14 @@ delta compression in the packfile sent by the server.
>  By default, `git backfill` downloads all blobs reachable from the `HEAD`
>  commit. This set can be restricted or expanded using various options.
>
> +OPTIONS
> +-------
> +
> +--batch-size=<n>::
> +	Specify a minimum size for a batch of missing objects to request
> +	from the server. This size may be exceeded by the last set of
> +	blobs seen at a given path. Default batch size is 16,000.

This is stale: s/16,000/50,000/

> diff --git a/builtin/backfill.c b/builtin/backfill.c
> index e5f2000d5e0..127333daef8 100644
> --- a/builtin/backfill.c
> +++ b/builtin/backfill.c
> @@ -112,6 +112,8 @@ int cmd_backfill(int argc, const char **argv, const char *prefix, struct reposit
>  		.batch_size = 50000,
>  	};
>  	struct option options[] = {
> +		OPT_INTEGER(0, "batch-size", &ctx.batch_size,
> +			    N_("Minimun number of objects to request at a time")),

s/Minimun/Minimum

Patrick
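
To illustrate the "minimum, not maximum" semantics discussed above, here is
a minimal sketch in C. It is not the code from the patch: `struct
batch_sketch`, `add_path()` and `flush_batch()` are hypothetical names, and
the only assumption taken from the commit message is that all objects found
at one path are queued together before the threshold is checked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the backfill context; not Git's actual struct. */
struct batch_sketch {
	size_t min_batch_size; /* threshold checked after each path */
	size_t pending;        /* objects queued but not yet requested */
	size_t requests;       /* number of simulated server requests */
	size_t largest;        /* largest request issued so far */
};

/* Simulate one server request for everything queued so far. */
static void flush_batch(struct batch_sketch *b)
{
	if (!b->pending)
		return;
	b->requests++;
	if (b->pending > b->largest)
		b->largest = b->pending;
	b->pending = 0;
}

/*
 * Queue all missing objects found at one path. The whole set is added
 * before the threshold is checked, so a request can exceed the minimum.
 */
static void add_path(struct batch_sketch *b, size_t nr_missing)
{
	assert(b->min_batch_size > 0);
	b->pending += nr_missing;
	if (b->pending >= b->min_batch_size)
		flush_batch(b);
}
```

With a minimum of 5, a path with 3 missing blobs followed by a path with 4
yields a single request for 7 objects, which is why a name like
`--min-batch-size` would describe the behavior more precisely.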