Elijah Newren <newren@xxxxxxxxx> writes: > On Thu, Dec 21, 2023 at 12:59 PM Sean Allred <allred.sean@xxxxxxxxx> wrote: >> Is there a current reason why pre-commit shouldn't be invoked during >> rebase, or is this just waiting for a reviewable patch? >> >> This was brought up before at [1] in 2015, but that thread so old at >> this point that it seemed prudent to double-check before investing time >> in a developing and testing a patch. >> >> [1]: https://lore.kernel.org/git/1m55i3m.1fum4zo1fpnhncM%25lists@xxxxxxxxxxxxxxxx/ > > I'm very opinionated here. I'm just one person, so definitely take > this with a grain of salt, but in my view... > > Personally, I think implementing any per-commit hook in rebase by > default is a mistake. It enforces a must-be-in-a-worktree-and-the- > worktree-must-be-updated-with-every-replayed-commit mindset, which I > find highly problematic[2], even if that's "what we always used to > do". > > [2] https://lore.kernel.org/git/20231124111044.3426007-1-christian.couder@xxxxxxxxx/ I'm not hip with what most pre-commit hooks do, but I'll point out that a hook like pre-commit assuming there is a worktree is the fault of the hook implementation, not of the infrastructure that invokes the hook. I imagine most folks on this list are aware that a worktree is not needed to create a commit and update a branch to point at it. FWIW, I would also find such a mindset to be highly problematic :-) I'll take a moment here to thank you, Christian, and everyone else in that effort for your interest in and work on git-replay; I've been trying to watch its activity on-list closely in the hopes that we can adopt it into our system once it's ready. > Because of that, I would prefer to see this at most be a command line > flag. However, we've already got a command line flag that while not > identical, is essentially equivalent: "--exec $MY_SCRIPT" (it's not > the same because it's a post-commit check, but you get notification of > any problematic commits, and an immediate stop of the rebase for you > to fix up the problematic commit; fixing up the commit shouldn't be > problematic since you are, after all, already rebasing). Indeed, and an --exec 'git hook run pre-commit || git reset --soft HEAD~' would probably get you farther. I can certainly see an argument for this, but from the perspective of designing a system for other developers to use, such a rebase would have to be triggered automatically (perhaps on pre-push). > I see Phillip already responded and suggested not running the > pre-commit hook with every commit, but only upon the first commit > after a "git rebase --continue". That seems far more reasonable to me > than running on every commit...though even that worries me in regards > to what assumptions that entails about what is present in the working > tree. It's worth noting the context here is to prevent developers from committing conflict markers, so this would actually be exactly sufficient. Invoking pre-commit at this time would also be consistent with the behaviors of prepare-commit-msg, commit-msg, and post-commit -- at least when I reword a commit during a rebase. However, post-commit is executed after each picked commit during a rebase, so pre-commit there would also be consistent. > (For example, what about folks with large repositories, so large that > a branch switch or full checkout is extremely costly, and which could > benefit from resolving conflicts in a separate sparse-checkout > worktree, potentially much more sparse than their main checkout? As it happens, a single checkout of our source runs upwards of 2GB, so I'm exactly in the population you're describing :-) The main reason we're moving to Git from SVN is that an SVN checkout can take upwards of an hour for us today -- even with some real shenanigans to make them go faster. On the Git side, we've also looked into (though I don't recall if we had much success with) narrowing the sparsity patterns to just the conflicts for conflict resolution workflows -- particularly when moving feature code between separate trunks. So I guess I'm also glad we weren't too far off in left field on that one! (As I recall, one of the main challenges we faced there was ensuring there was enough stuff 'still around' so that both binary and project references could resolve and folks could use that information to help resolve conflicts. Hopefully git-replay can be smart enough to allow some customization on that front. We found some success with feeding the list of conflicted files into some arbitrary logic that spat out the sparsity pattern to use.) > And what if people like that really fast rebase resolution (namely, > done in a separate very sparse checkout which also has the advantage > of not polluting your current working tree) so much that they use it > on smaller repositories as well? Can I not even experiment with this > idea because of the historical per-commit-at-least-as-full-as-main > -worktree-checkout assumptions we've baked into rebase?) I'd be interested in reading more about this baked-in assumption. Are these mostly laid out in replay-design-notes.txt[3]? > While at it, I should also mention that I'm not a fan of the broken > pre-rebase hook; its design ties us to doing a single branch at a > time. Maybe that hook is not quite as bad, though, since we already > broke that hook and no one seemed to care (in particular, when > --update-refs was implemented). But if no one seems to care about > broken hooks, I think the proper answer is to either get rid of the > hook or fix it. If I were to guess, this likely stems either from an inexact definition of the hook in documentation (ultimately resulting in incomplete tests) or folks incorrectly assuming what each hook should do based purely on its name. Which leads to an interesting point: pre-commit specifically states that it is invoked by git-commit -- not that it's invoked whenever a commit is created. So perhaps the correct thing to do here (if a hook is in fact needed) would be to define a new hook -- but I worry about doing that in the current state where there doesn't *seem* to be very rigid coordination of when client hooks are invoked in terms of plumbing rather than porcelain. > Anyway, as I mentioned, I'm quite opinionated here. To the point that > I deemed git-rebase in its current form to possibly be unfixable > (after first putting a lot of work into improving it over the past > years) and eventually introduced a new "git-replay" command which I > hope to make into a competent replacement for it. Given that I do > have another command to do my experiments, others on the list may > think it's fine to send rebase further down this route, but I was > hoping to avoid further drift so that there might be some way of > re-implementing rebase on top of my "git-replay" ideas/design. I appreciate your perspective; you've certainly thought a lot about this space -- and I definitely share your goal of consolidating implementations for obvious reasons. So I suppose that leaves me with four possible paths forward: 1. Pursue invoking pre-commit before each commit in `git rebase` (likely generic in the sequencer) to be consistent with post-commit. It sounds like this isn't a popular option, but I'm curious to folks' thoughts on the noted behavior of post-commit here. 2. Pursue invoking pre-commit on `git rebase --continue` (likely on any --continue in the sequencer). This has the benefit of using existing configuration on developer machines to purportedly 'do the right thing' when its likely humans are touching code during conflict resolution. It's worth noting this isn't the only reason you might --continue, though, since the naive interpretation of this approach completely ignores sequencer commands like 'break', though it could probably just do what commit-msg does. 3. Define and implement a new hook that is called whenever a new commit is about to be (or has been?) written. Such a hook could be specifically designed to discourage assuming there's a working copy, though we're kidding nobody by thinking it won't be used downstream with that assumption. At least we could be explicit about expectations, though. This is *probably* a lot more design work than this little paragraph lets on, but I've not personally watched the introduction of a new hook so I don't have context for what to expect. 4. Trigger a rebase --exec in our pre-push. This is certainly the least work in git.git (i.e., no work at all), but it comes with the distinct disadvantage of playing whiplash with the developer's focus. During conflict resolution, they're thinking about conflicts. When you're ready to push, its likely that you're no longer thinking about conflicts. Does the behavior of post-commit here change any minds? [3]: https://github.com/newren/git/blob/2a621020863c0b867293e020fec0954b43818789/replay-design-notes.txt#L162 -- Sean Allred