Re: Improvement: `git-maintenance` to allow configuring of remotes to fetch

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Sep 03, 2024 at 11:31:04AM +0530, Shubham Kanodia wrote:
> On Tue, Sep 3, 2024 at 10:48 AM Patrick Steinhardt <ps@xxxxxx> wrote:
> > On Mon, Sep 02, 2024 at 09:16:24PM +0530, Shubham Kanodia wrote:
> > > > Patrick Steinhardt <ps@xxxxxx> writes:
> > > >
> > > > > I'm not aware of any discussion around this...
> > > >
> > > > I do not think so, either.
> > > >
> > > > I agree that it makes as much sense to limit prefetches to a subset
> > > > of remotes as it makes sense to limit to certain hierarchies (e.g.
> > > > excluding refs/changes/ or even limiting to refs/heads/seen and
> > > > nothing else).
> > >
> > > I'm seeking advice on the configuration option structure for this
> > > feature. The typical config format for maintenance tasks seems to be
> > > as follows:
> > >
> > > `maintenance.<task-name>.<option>`
> > >
> > > A natural extension of this for the prefetch task could be:
> > >
> > > ```
> > > git config maintenance.prefetch.<remote-name>.refs refs/heads/master
> > > ```
> > >
> > > In this structure, the 'refs' value represents only the source part of
> > > a refspec, and both remote and refs can be configured.
> > > Specifying a full refspec isn't necessary since the --prefetch option
> > > may override the destination anyway.
> > >
> > > While I've successfully implemented this approach, I'm open to
> > > suggestions for alternative configuration options. My concerns are:
> > >
> > > 1. Most Git configurations are nested up to three levels deep, whereas
> > > this proposal introduces a fourth level.
> > > 2. This configuration appears in the config file as:
> > >
> > > ```
> > > [maintenance "prefetch.origin"]
> > >        refs = refs/heads/master
> > > ```
> > > which might look odd?
> >
> > Agreed, it does. To me, the most natural way to configure this would be
> > as part of the remotes themselves:
> >
> > ```
> > [remote "origin"]
> >     url = https://example.com/repo.git
> >     fetch = +refs/heads/*:refs/remotes/origin/*
> >     # Whether or not the prefetch task shall fatch this repository.
> >     # Defaults to `true`.
> >     prefetch = true
> >     # An arbitrary number of refspecs used by the prefetch task.
> >     # Overrides the fetch refspec if given, otherwise we fall back to
> >     # using the fetch refspec.
> >     prefetchRefspec = +refs/heads/main:refs/remotes/origin/main
> > ```
> >
> > The prefetch refspec would be rewritten by git-maintenance(1) such that
> > the destination part (the right-hand side of the refspec) is prefixed
> > with `refs/prefetch/`, same as the fetch refspec would be changed in
> > this way.
> >
> > An alternative would be to _not_ rewrite the prefetch refspec at all and
> > thus allow the user to prefetch into arbitrary hierarchies. But I'm a
> > bit worried that this might cause users to misconfigure prefetches by
> > accident, causing it to overwrite their usual set of refs.
> >
> > > Also, hopefully my mail is formatted better this time!
> >
> > It is, thanks!
> >
> > Patrick
> 
> Interesting. I guess if we put this in `remote.*` instead of
> `maintenance.*` what's unclear then is if this setting should also be
> respected by `git fetch --prefetch`
> when used outside the context of a maintenance task — since that'd
> probably be a bigger change.

I would've expected it to already do, given that it is explicitly
documented in git-fetch(1) to be for git-maintenance(1). I don't have
enough knowledge around the prefetching task though to be able to tell
whether it should or shouldn't.

> For instance, the `skipFetchAll` remote config option seems to apply
> to prefetches (within maintenance & outside) and normal fetches.
> 
> Additionally, we'd need to discuss what to do with backward compatibility:
> 
> If we were designing maintenance prefetch right now, it'd probably
> make sense not to fetch it for any remote / refspec by default unless
> explicitly enabled since
> fetching all refs can be a waste of space in more cases than not.
> 
> However, since the current behaviour already fetches all remotes and
> refs, I don't know if breaking this is something we could do? If not,
> the behavior would be —
> 
> 1. If none of the remotes specify a `prefetch: true`, then prefetch
> all remotes and refs (backwards compat)
> 2. If at least one of the remotes specifies `prefetch: true` then only
> that remote should be fetched
> 3. Two-tier `fetch` / `prefetchRefSpec` hierarchy to decide which refs
> to fetch (we can decide on the name later as `fetch` and
> `prefetchRefSpec` seem asymmetrical)

I don't think we should change the current default. As mentioned in my
mail, `prefetch` should default to `true` such that the behaviour as we
have it right now is unchanged. Thus any remote that doesn't have it
gets fetched by default, and to disable fetching a specific remote you
would have to set `remote.<name>.prefetch = false`.

While the second item, namely change the default to `false` when there
is at least one `prefetch` config, might be something to consider. But
in the end I think it would lead to confusing behaviour, so I'd not go
there personally.

Patrick




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux