Re: Consider adding pruning of refs to git maintenance

On Mon, Dec 16, 2024 at 06:03:03PM +0530, Shubham Kanodia wrote:
> Remote-tracking refs accumulate quickly in large repositories as users
> merge and delete their branches. While these branches are cleaned up
> on the remote, local repositories may retain stale references to
> deleted branches unless explicitly pruned. The number of local refs
> can have an impact on git performance of several commands.
> 
> Git currently provides two ways for orphan local refs to be cleaned up —
> 1. Automated: `fetch.prune` and `fetch.pruneTags` configurations with
> `git fetch/pull`
> 2. Manual: `git remote prune`
> 
> However, both approaches have issues:
> - Full `git fetch/pull` operations are expensive on large
> repositories, pulling thousands of irrelevant refs
> - Manual `git remote prune` requires user intervention

Fair. Neither of those issues feels insurmountable, but I can see why it
could make our users' lives easier.

> Proposal:
> Add remote pruning to the daily `git-maintenance` task. This would
> clean stale refs automatically without requiring full fetches or
> manual intervention.
> 
> This is especially useful for users who historically pulled all
> refs/tags but now use targeted fetches. Moreover, it decouples the
> cleanup action (pruning) from the action to fetch more refs.

I think we need to consider a couple of things:

  - It's somewhat awkward to have maintenance jobs that interact with a
    remote, as that may not work in contexts where you actually need to
    authenticate. But there is precedent with the "prefetch" task, so we
    have already opened that can of worms.

  - Maintenance tries to be as non-destructive as reasonably possible,
    and deleting refs certainly is a destructive operation.

  - We try to avoid bad interactions with a user that works concurrently
    in the repo that git-maintenance(1) runs in. This is the reason why
    the "prefetch" task does not fetch into `refs/remotes`, but into a
    separate ref namespace.
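
To illustrate the last point, here is a minimal sketch that runs the
"prefetch" task in a throwaway clone and checks which refs it touched. It
assumes a Git version that ships git-maintenance(1); the repository names
`upstream` and `clone` and the demo identity are made up for the example.

```shell
#!/bin/sh
set -e

# Throwaway repositories; "upstream" and "clone" are illustrative names.
tmp=$(mktemp -d)
git init -q "$tmp/upstream"
git -C "$tmp/upstream" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial"
git clone -q "$tmp/upstream" "$tmp/clone"

# Create a commit upstream that the clone has not fetched yet.
git -C "$tmp/upstream" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "second"

# Record the remote-tracking refs, run the prefetch task, and record them again.
before=$(git -C "$tmp/clone" for-each-ref --format='%(refname) %(objectname)' refs/remotes)
git -C "$tmp/clone" maintenance run --task=prefetch
after=$(git -C "$tmp/clone" for-each-ref --format='%(refname) %(objectname)' refs/remotes)

# The fetched refs are expected under refs/prefetch/, not refs/remotes/.
prefetched=$(git -C "$tmp/clone" for-each-ref --format='%(refname)' refs/prefetch)
echo "$prefetched"

rm -rf "$tmp"
```

If `before` and `after` are identical while `prefetched` is non-empty, the
task downloaded the new objects without moving anything the user sees under
`refs/remotes`. A pruning task would not have that luxury, since the refs it
deletes are exactly the user-visible ones.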

If we want to have such a feature I'd thus claim that it would be most
sensible to make it opt-in rather than opt-out. I wouldn't want to be
surprised by remote refs vanishing after going to bed, but may be okay
with it when I explicitly ask for it.

At that point one has to raise the question whether it is still all that
useful compared to running `git remote prune` manually every now and
then. Mostly because explicitly configuring maintenance is probably
something that only power users would do, and those power users would
likely know to prune manually.
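
For comparison, the manual route is short. The sketch below sets up two
scratch repositories (the names `upstream` and `clone`, the branch `topic`
and the demo identity are only for illustration), deletes a branch on the
remote side, previews the prune with `--dry-run` and then performs it.

```shell
#!/bin/sh
set -e

# Scratch setup: an "upstream" repository with a topic branch, and a clone of it.
tmp=$(mktemp -d)
git init -q "$tmp/upstream"
git -C "$tmp/upstream" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial"
git -C "$tmp/upstream" branch topic
git clone -q "$tmp/upstream" "$tmp/clone"

# Delete the branch upstream; the clone keeps a stale refs/remotes/origin/topic.
git -C "$tmp/upstream" branch -q -D topic
stale_before=$(git -C "$tmp/clone" for-each-ref --format='%(refname)' refs/remotes/origin/topic)

# Preview what would be removed, without deleting anything.
git -C "$tmp/clone" remote prune --dry-run origin

# Actually delete the stale remote-tracking refs for "origin".
git -C "$tmp/clone" remote prune origin
stale_after=$(git -C "$tmp/clone" for-each-ref --format='%(refname)' refs/remotes/origin/topic)

rm -rf "$tmp"
```

Setting `fetch.prune` to `true` via `git config` gives the same cleanup as a
side effect of every fetch, which is the automated option mentioned at the
top of the thread.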

In any case, that's just my 2c. I can see a use case for your feature,
but think we should be careful with how it is introduced.

> Happy to submit on a patch for the same unless there's something
> obvious that I've missed here.

I'm happy to have a look in case you decide to implement this feature.

Patrick



