Hi Elijah, Le 27/12/2019 à 23:45, Elijah Newren a écrit : > Hi Alban, > > On Fri, Dec 27, 2019 at 1:11 PM Alban Gruin <alban.gruin@xxxxxxxxx> wrote: >> >> Hi Johannes & Elijah, >> >> Le 01/02/2019 à 07:04, Johannes Schindelin a écrit : >>> Hi Elijah, >>> >>> as discussed at the Contributors' Summit, I ran p3400 as-is (i.e. with the >>> --am backend) and then with --keep-empty to force the interactive backend >>> to be used. Here are the best of 10, on my relatively powerful Windows 10 >>> laptop, with current `master`. >>> >>> With regular rebase --am: >>> >>> 3400.2: rebase on top of a lot of unrelated changes 5.32(0.06+0.15) >>> 3400.4: rebase a lot of unrelated changes without split-index 33.08(0.04+0.18) >>> 3400.6: rebase a lot of unrelated changes with split-index 30.29(0.03+0.18) >>> >>> with --keep-empty to force the interactive backend: >>> >>> 3400.2: rebase on top of a lot of unrelated changes 3.92(0.03+0.18) >>> 3400.4: rebase a lot of unrelated changes without split-index 33.92(0.03+0.22) >>> 3400.6: rebase a lot of unrelated changes with split-index 38.82(0.03+0.16) >>> >>> I then changed it to -m to test the current scripted version, trying to >>> let it run overnight, but my laptop eventually went to sleep and the tests >>> were not even done. I'll let them continue and report back. >>> >>> My conclusion after seeing these numbers is: the interactive rebase is >>> really close to the performance of the --am backend. So to me, it makes a >>> total lot of sense to switch --merge over to it, and to make --merge the >>> default. We still should investigate why the split-index performance is so >>> significantly worse, though. >>> >>> Ciao, >>> Dscho >>> >> >> I investigated a bit on this. From a quick glance at a callgrind trace, >> I can see that ce_write_entry() is called 20 601[1] times with `git am', >> but 739 802 times with the sequencer when the split-index is enabled. > > Sweet, thanks for digging in and analyzing this. > >> For reference, here are the timings, measured on my Linux machine, on a >> tmpfs, with git.git as the repo: >> >> `rebase --am': >>> 3400.2: rebase on top of a lot of unrelated changes 0.29(0.24+0.03) >>> 3400.4: rebase a lot of unrelated changes without split-index 6.77(6.51+0.22) >>> 3400.6: rebase a lot of unrelated changes with split-index 4.43(4.29+0.13) >> `rebase --quiet': > > --quiet? Isn't that flag supposed to work with both backends and not > imply either one? We previously used --keep-empty, though there's a > chance that flag means we're not doing a fair comparison (since 'am' > will drop empty commits and thus have less work to do). Is there any > chance you actually ran a different command, but when you went to > summarize just typed the wrong flag name? Anyway, the best would > probably be to use --merge here (at the time Johannes and I were > testing, that wouldn't have triggered the sequencer, but it does now), > after first applying the en/rebase-backend series just to make sure > we're doing an apples to apples comparison. However, I suspect that > empty commits probably weren't much of a factor and you did find some > interesting things... > Yes, I did use `--keep-empty' but misremembered it when writing this email… >>> 3400.2: rebase on top of a lot of unrelated changes 0.24(0.21+0.02) >>> 3400.4: rebase a lot of unrelated changes without split-index 5.60(5.32+0.27) >>> 3400.6: rebase a lot of unrelated changes with split-index 5.67(5.40+0.26) >> >> This comes from two things: >> >> 1. There is not enough shared entries in the index with the sequencer. >> >> do_write_index() is called only by do_write_locked_index() with `--am', >> but is also called by write_shared_index() with the sequencer once for >> every other commit. As the latter is only called by >> write_locked_index(), which means that too_many_not_shared_entries() >> returns true for the sequencer, but never for `--am'. >> >> Removing the call to discard_index() in do_pick_commit() (as in the >> first attached patch) solve this particular issue, but this would >> require a more thorough analysis to see if it is actually safe to do. > > I'm actually surprised the sequencer would call discard_index(); I > would have thought it would have relied on merge_recursive() to do the > necessary index changes and updates other than writing the new index > out. But I'm not quite as familar with the sequencer so perhaps > there's some reason I'm unaware of. (Any chance this is a left-over > from when sequencer invoked external scripts to do the work, and thus > the index was updated in another processes' memory and on disk, and it > had to discard and re-read to get its own process updated?) > The sequencer re-reads the index after invoking an external command (either `git checkout', `git merge' or an `exec' command from the todo list), which makes sense. But this one seems to come from 6eb1b437933 ("cherry-pick/revert: make direct internal call to merge_tree()", 2008-09-02). So, yes, quite old, and perhaps no longer justified. I know I had to add another discard_cache() in rebase--interactive.c because it broke something with the submodules, but it does not seems all that useful now that rebase.c no longer has to fork to use the sequencer. Cheers, Alban