Re: [PATCH 3/5] backfill: add --batch-size=<n> option

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Dec 06, 2024 at 08:07:16PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <derrickstolee@xxxxxxxxxx>
> 
> Users may want to specify a minimum batch size for their needs. This is only
> a minimum: the path-walk API provides a list of OIDs that correspond to the
> same path, and thus it is optimal to allow delta compression across those
> objects in a single server request.

Okay, here you explicitly say that this is a minimum batch size, so this
is by design and with a proper reason. Good.

> We could consider limiting the request to have a maximum batch size in the
> future. For now, we let the path-walk API batches determine the
> boundaries.

Should we maybe rename `--batch-size` to `--min-batch-size` so that it
does not become awkward if we ever want to have a maximum batch size, as
well? Also helps to set expectations with the user.

[snip]
> Based on these experiments, a batch size of 50,000 was chosen as the
> default value.

Thanks for all the data, this is really helpful!

> diff --git a/Documentation/git-backfill.txt b/Documentation/git-backfill.txt
> index 0e10f066fef..9b0bae04e9d 100644
> --- a/Documentation/git-backfill.txt
> +++ b/Documentation/git-backfill.txt
> @@ -38,6 +38,14 @@ delta compression in the packfile sent by the server.
>  By default, `git backfill` downloads all blobs reachable from the `HEAD`
>  commit. This set can be restricted or expanded using various options.
>  
> +OPTIONS
> +-------
> +
> +--batch-size=<n>::
> +	Specify a minimum size for a batch of missing objects to request
> +	from the server. This size may be exceeded by the last set of
> +	blobs seen at a given path. Default batch size is 16,000.

This is stale: s/16,000/50,000/

> diff --git a/builtin/backfill.c b/builtin/backfill.c
> index e5f2000d5e0..127333daef8 100644
> --- a/builtin/backfill.c
> +++ b/builtin/backfill.c
> @@ -112,6 +112,8 @@ int cmd_backfill(int argc, const char **argv, const char *prefix,
> struct reposit
>                 .batch_size = 50000,
>         };
>         struct option options[] = {
> +               OPT_INTEGER(0, "batch-size", &ctx.batch_size,
> +                           N_("Minimun number of objects to request at a time")),

s/Minimun/Minimum

Patrick




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux