Re: Re: [PATCH 3/4] builtin/clone.c: recognize fetch.serverOption configuration

Patrick Steinhardt <ps@xxxxxx> · Thu, 5 Sep 2024 13:05:15 +0200

On Wed, Sep 04, 2024 at 03:49:28PM +0800, Xing Xin wrote:
> At 2024-09-03 18:09:45, "Patrick Steinhardt" <ps@xxxxxx> wrote:
> >On Mon, Sep 02, 2024 at 12:13:55PM +0000, Xing Xin via GitGitGadget wrote:
> >> diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
> >> index 8e925db7e9c..105645ed685 100644
> >> --- a/Documentation/git-clone.txt
> >> +++ b/Documentation/git-clone.txt
> >> @@ -149,6 +149,9 @@ objects from the source repository into a pack in the cloned repository.
> >>  	unknown ones, is server-specific.
> >>  	When multiple ++--server-option=++__<option>__ are given, they are all
> >>  	sent to the other side in the order listed on the command line.
> >> +	When no ++--server-option=++__<option>__ is given from the command
> >> +	line, the values of configuration variable `fetch.serverOption`
> >> +	are used instead.
> >>  
> >>  `-n`::
> >>  `--no-checkout`::
> >
> >I'm not a 100% sure, but I don't think that `fetch.*` configs typically
> >impact git-clone(1). So this here is a tad surprising to me.
> >
> >It makes me wonder whether it is actually sensible to implement this as
> >part of the `fetch` namespace in the first place. I'm not yet quite sure
> >where this whole series slots in, that is why one would want to set up
> >default server options in the first place. So what I'm wondering right
> >now is whether the server options are something that you want to apply
> >globally for all remotes, or whether you'd rather want to set them up
> >per remote.
> 
> Sorry for not explaining our use case clearly. We have several internal
> repositories configured with numerous CI tasks, each requiring code
> preparation (sometimes via clone, sometimes via fetch). These CI tasks
> are ususally triggered by post-receive hook, so the concurrent tasks are
> actually fetching the same copy of code.
> 
> On git server, we want to deploy a special pack-objects-hook to mitigate
> the performance impacts caused by these CI tasks (so the packfile
> produced by git-pack-objects can be reused).  Since not all clone/fetch
> operations can benefit from this caching mechanism (e.g. pulls from
> users' dev environment), we need the client to pass a special identifier
> to inform the server whether caching support should be enabled for that
> clone/fetch. Clearly, using server options is a good choice.
> 
> To achieve our design, we need to add two patch series to git:
> 
> 1. Support injecting server options to identify environments via
>    configuration, because adding the --server-option parameter would
>    require too many script modifications, making it difficult to deploy.
>    This is what this patch series does.
> 2. Git server should pass the received server options as environment
>    variables (similar to push options) to the pack-objects-hook.

When you talk about client, do you mean that the actual users will have
to configure this? That sounds somewhat unmaintainable on your side from
the surface. I guess I ain't got enough knowledge around the environment
you operate in though, so I probably shouldn't judge.

> >In the latter case I could see that it may make sense to instead make
> >this `remote.<name>.serverOption`. This would also remove the unclean
> 
> I named the new configuration `fetch.serverOption` mainly to follow the
> `push.pushOption` pattern.  Since which server options to support is
> actually server-specific, using `remote.<name>.serverOption` is a good
> idea for git-fetch. However, how should we design the configuration for
> git-ls-remote or git-clone, if we wanna provide all of them with a
> default list of server options to send?

As mentioned in another reply, I think that putting this into the remote
configuration "remote.*.serverOption" might be a better solution, as it
also brings you the ability to set this per remote by default.

Patrick