On Tue, Sep 03, 2024 at 11:31:04AM +0530, Shubham Kanodia wrote: > On Tue, Sep 3, 2024 at 10:48 AM Patrick Steinhardt <ps@xxxxxx> wrote: > > On Mon, Sep 02, 2024 at 09:16:24PM +0530, Shubham Kanodia wrote: > > > > Patrick Steinhardt <ps@xxxxxx> writes: > > > > > > > > > I'm not aware of any discussion around this... > > > > > > > > I do not think so, either. > > > > > > > > I agree that it makes as much sense to limit prefetches to a subset > > > > of remotes as it makes sense to limit to certain hierarchies (e.g. > > > > excluding refs/changes/ or even limiting to refs/heads/seen and > > > > nothing else). > > > > > > I'm seeking advice on the configuration option structure for this > > > feature. The typical config format for maintenance tasks seems to be > > > as follows: > > > > > > `maintenance.<task-name>.<option>` > > > > > > A natural extension of this for the prefetch task could be: > > > > > > ``` > > > git config maintenance.prefetch.<remote-name>.refs refs/heads/master > > > ``` > > > > > > In this structure, the 'refs' value represents only the source part of > > > a refspec, and both remote and refs can be configured. > > > Specifying a full refspec isn't necessary since the --prefetch option > > > may override the destination anyway. > > > > > > While I've successfully implemented this approach, I'm open to > > > suggestions for alternative configuration options. My concerns are: > > > > > > 1. Most Git configurations are nested up to three levels deep, whereas > > > this proposal introduces a fourth level. > > > 2. This configuration appears in the config file as: > > > > > > ``` > > > [maintenance "prefetch.origin"] > > > refs = refs/heads/master > > > ``` > > > which might look odd? > > > > Agreed, it does. To me, the most natural way to configure this would be > > as part of the remotes themselves: > > > > ``` > > [remote "origin"] > > url = https://example.com/repo.git > > fetch = +refs/heads/*:refs/remotes/origin/* > > # Whether or not the prefetch task shall fatch this repository. > > # Defaults to `true`. > > prefetch = true > > # An arbitrary number of refspecs used by the prefetch task. > > # Overrides the fetch refspec if given, otherwise we fall back to > > # using the fetch refspec. > > prefetchRefspec = +refs/heads/main:refs/remotes/origin/main > > ``` > > > > The prefetch refspec would be rewritten by git-maintenance(1) such that > > the destination part (the right-hand side of the refspec) is prefixed > > with `refs/prefetch/`, same as the fetch refspec would be changed in > > this way. > > > > An alternative would be to _not_ rewrite the prefetch refspec at all and > > thus allow the user to prefetch into arbitrary hierarchies. But I'm a > > bit worried that this might cause users to misconfigure prefetches by > > accident, causing it to overwrite their usual set of refs. > > > > > Also, hopefully my mail is formatted better this time! > > > > It is, thanks! > > > > Patrick > > Interesting. I guess if we put this in `remote.*` instead of > `maintenance.*` what's unclear then is if this setting should also be > respected by `git fetch --prefetch` > when used outside the context of a maintenance task — since that'd > probably be a bigger change. I would've expected it to already do, given that it is explicitly documented in git-fetch(1) to be for git-maintenance(1). I don't have enough knowledge around the prefetching task though to be able to tell whether it should or shouldn't. > For instance, the `skipFetchAll` remote config option seems to apply > to prefetches (within maintenance & outside) and normal fetches. > > Additionally, we'd need to discuss what to do with backward compatibility: > > If we were designing maintenance prefetch right now, it'd probably > make sense not to fetch it for any remote / refspec by default unless > explicitly enabled since > fetching all refs can be a waste of space in more cases than not. > > However, since the current behaviour already fetches all remotes and > refs, I don't know if breaking this is something we could do? If not, > the behavior would be — > > 1. If none of the remotes specify a `prefetch: true`, then prefetch > all remotes and refs (backwards compat) > 2. If at least one of the remotes specifies `prefetch: true` then only > that remote should be fetched > 3. Two-tier `fetch` / `prefetchRefSpec` hierarchy to decide which refs > to fetch (we can decide on the name later as `fetch` and > `prefetchRefSpec` seem asymmetrical) I don't think we should change the current default. As mentioned in my mail, `prefetch` should default to `true` such that the behaviour as we have it right now is unchanged. Thus any remote that doesn't have it gets fetched by default, and to disable fetching a specific remote you would have to set `remote.<name>.prefetch = false`. While the second item, namely change the default to `false` when there is at least one `prefetch` config, might be something to consider. But in the end I think it would lead to confusing behaviour, so I'd not go there personally. Patrick