Re: [PATCH 00/14] Introduce new `git replay` command

Felipe Contreras <felipe.contreras@xxxxxxxxx> · Fri, 14 Apr 2023 11:39:04 -0600

Christian Couder wrote:
> # Quick Overview (from Elijah)
> 
> `git replay`, at a basic level, can perhaps be thought of as a
> "default-to-dry-run rebase" -- meaning no updates to the working tree,
> or to the index, or to any references.

Interesting, I just ran into this problem trying to clean up my personal git
branches.

Simply checking which branches can be cleanly rebased on top of master takes a
significant amount of time without any tricks, and using `git merge-tree` still
takes some time.
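
A minimal sketch of the `git merge-tree` check, assuming Git 2.38 or later
(where `--write-tree` mode exists). The function names, the `master` default
and the `repo` parameter are made up for illustration. Note that this tests
whether a *merge* is clean, which only approximates whether a rebase would be:

```python
import subprocess


def interpret_merge_tree_status(returncode):
    """Map the exit status of `git merge-tree --write-tree` to a bool.

    0 means the merge was clean, 1 means it had conflicts; anything
    else means git could not perform the merge at all.
    """
    if returncode == 0:
        return True
    if returncode == 1:
        return False
    raise RuntimeError(f"git merge-tree failed with status {returncode}")


def merges_cleanly(branch, upstream="master", repo="."):
    """Check whether `branch` merges into `upstream` without conflicts.

    Nothing is checked out: the working tree, index and refs are untouched.
    """
    result = subprocess.run(
        ["git", "-C", repo, "merge-tree", "--write-tree", upstream, branch],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return interpret_merge_tree_status(result.returncode)
```

Calling `merges_cleanly("my-topic")` once per branch avoids a checkout, but
each call is still a separate git process, so it still adds up.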

But the biggest offender is checking which patches have not yet been merged
into master, which takes 52 seconds on my machine, which is by no means old.
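
The message does not say which command was used for this check. One common
way is `git cherry`, which compares patch ids; the sketch below assumes that
approach, and its function names and defaults are hypothetical:

```python
import subprocess


def parse_cherry_output(output):
    """Return commit ids that `git cherry` marked with '+'.

    '+' means no patch-equivalent commit was found in the upstream;
    '-' means an equivalent commit is already there.
    """
    unmerged = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "+":
            unmerged.append(fields[1])
    return unmerged


def unmerged_commits(branch, upstream="master", repo="."):
    """List commits on `branch` with no equivalent patch in `upstream`."""
    result = subprocess.run(
        ["git", "-C", repo, "cherry", upstream, branch],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_cherry_output(result.stdout)
```

A branch for which `unmerged_commits()` returns an empty list has nothing
left to apply and is a candidate for deletion.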

> # Reasons for diverging from cherry-pick & rebase (from Elijah)
> 
> * Server side needs

I personally don't care about the server side, but...

>   * Both cherry-pick and rebase, via the sequencer, are heavily tied
>     to updating the working tree, index, some refs, and a lot of
>     control files with every commit replayed, and invoke a mess of
>     hooks[1] that might be hard to avoid for backward compatibility
>     reasons (at least, that's been brought up a few times on the
>     list).

This is important as an end user as well.

Since day 1, one of the important selling points of git was that operations
that could be done in milliseconds did take milliseconds.

If it can be done faster, why wouldn't I want it to be done faster?

> * Decapitate HEAD-centric assumptions

That's good, but not particularly important at the moment IMO.

> * Performance
> 
>   * jj is slaughtering us on rebase speed[2].  I would like us to become
>     competitive.  (I dropped a few comments in the link at [2] about why
>     git is currently so bad.)

Indeed.

>   * From [3], there was a simple 4-patch series in linux.git that took
>     53 seconds to rebase.

I did participate in that discussion, but Uwe Kleine-König never responded back.

In [1] he clearly noticed the problem was *before* attempting to apply any
patch. Other people mentioned the fork-point detection, but I don't think that
was the issue; my guess was that checking for the possibility of a fast-forward
was the issue.

The code was clearly doing the wrong thing for that case, but I believe it
should have been fixed by d42c9ffa0f (rebase: factor out branch_base
calculation, 2022-10-17).

It would be interesting to see if this issue can be reproduced somehow.

>     Switching to ort dropped it to 16 seconds.

No, it dropped to 16 seconds for Elijah, not Uwe. Uwe (who had the real
repository) noticed a big reduction of around 70%, but the discrepancy between
using --onto and not using it always remained.

>     While that sounds great, only 11 *milliseconds* were needed to do
>     the actual merges.  That means almost *all* the time (>99%) was
>     overhead!  Big offenders:
> 
>     * --reapply-cherry-picks should be the default
> 
>     * can_fast_forward() should be ripped out, and perhaps other extraneous
>       revision walks

Doesn't d42c9ffa0f (rebase: factor out branch_base calculation, 2022-10-17)
deal with that?

---

I think something like this is definitely needed. When I rewrote `git rebase`
to use `git cherry-pick` I noticed many areas of improvement, and I'm of the
opinion that `git rebase` should be rewritten from scratch.

But precisely because git focuses too much on backwards compatibility (and
often in the wrong areas), I think `git replay` should be thoroughly discussed
before accepting something we could quickly realize can be substantially
improved.

Cheers.

[1] https://lore.kernel.org/git/20210528214024.vw4huojcklrm6d27@xxxxxxxxxxxxxx/

-- 
Felipe Contreras