On 11/7/2020 2:39 PM, Elijah Newren wrote: > On Sat, Nov 7, 2020 at 7:02 AM Derrick Stolee <stolee@xxxxxxxxx> wrote: >> >> On 11/7/20 1:06 AM, Elijah Newren wrote: >>> Hi Derrick, >>> >>> On Tue, Nov 3, 2020 at 8:36 AM Elijah Newren <newren@xxxxxxxxx> wrote: >>>> All that said, for testing either branch you just need to first set >>>> pull.twohead=ort in your git config (see >>>> https://lore.kernel.org/git/61217a83bd7ff0ce9016eb4df9ded4fdf29a506c.1604360734.git.gitgitgadget@xxxxxxxxx/), >>>> or, if running regression tests, set GIT_TEST_MERGE_ALGORITHM=ort. >>> >>> I probably also should have mentioned that merge-ort does not (yet?) >>> heed merge.renames configuration setting; it always detects renames. >>> I know you run with merge.renames=false, so you won't quite get an >>> apples-to-apples comparison. However, part of my point was I wanted >>> to make renames fast enough that they could be left turned on, even >>> for the large scale repos, so I'm very interested in your experience. >>> If you need an escape hatch, though, just put a "return 1" at the top >>> of detect_and_process_renames() to turn it off. >>> >>> Oh, and I went through and re-merged all the merge commits in the >>> linux kernel and found a bug in merge-ort while doing that (causing it >>> to die, not to merge badly). I'm kind of surprised that none of my >>> testcases triggered that failure earlier; if you're testing it out, >>> you might want to update to get the fix (commit 067e5c1a38, >>> "merge-ort: fix bug with cached_target_names not being initialized in >>> redos", 2020-11-06). >> >> I did manage to do some testing to see what happens with >> a large repo under a small sparse-checkout. And using >> trace2, I was able to see that your code is being exercised. >> Unfortunately, I didn't see any performance improvement, and >> that is likely due to needing to expand the index entirely >> when checking out the merge commit. >> >> Is there a command to construct a merge commit without >> actually checking it out? That would reduce the time spent >> expanding the index, which would allow your algorithm to >> really show its benefits! > > Wow, very interesting. I am working on a --remerge-diff option for > log, which implies -p and is similar to -c or --cc in that it makes > merge commits show a diff, but which in particular remerges the two > parent commits complete with conflict markers and such and then diffs > the merge commit against that intermediate remerge. That's a case > that constructs a merge commit without ever touching the index (or > working tree)...but there's no equivalent comparison point for > merge-recursive. So, it doesn't provide something to compare against > (and while the code can be used I don't actually have a --remerge-diff > option yet -- it just hardcodes the behavior on if wanted or not), so > I'm not sure if you'd be interested in it. If you are, let me know > though, and I'll send details. > > However, I'm really surprised here, because merge-recursive always > reads and writes the index too (the index is the basis for its whole > algorithm). In fact, merge-recursive always reads the index at least > *twice* (it unconditionally discards and re-reads the index), so you > must have some kind of specialized tweaking of merge-recursive if it > somehow avoids a full index read/write. In order to do an > apples-to-apples comparison, we'd need to make those same tweaks to > merge-ort, but I don't have a clue what kind of tweaks you've made > here. So, some investigation points: > > *1*. Could you give me the accumulated times from the trace2_regions > so we can verify where the time is spent? The 'summarize-perf' script > at the toplevel of the repo in my ort branch might be helpful for > this; just prefix any git command with that script and it accumulates > the trace2 region times and prints them out. For example, I could run > 'summarize-perf git merge --no-edit B^0' or 'summarize-perf test-tool > fast-rebase --onto HEAD ca76bea9 myfeature'. Here's an example: > > === BEGIN OUTPUT === > $ /home/newren/floss/git/summarize-perf test-tool fast-rebase --onto > HEAD 4703d9119972bf586d2cca76ec6438f819ffa30e hwmon-updates > Rebasing fd8bdb23b91876ac1e624337bb88dc1dcc21d67e... > Done. > Accumulated times: > 0.031 : <unmeasured> ( 3.2%) > 0.837 : 35 : label:incore_nonrecursive > 0.003 : <unmeasured> ( 0.4%) > 0.476 : 41 : ..label:collect_merge_info > 0.001 : <unmeasured> ( 0.2%) > 0.475 : 41 : ....label:traverse_trees > 0.298 : 41 : ..label:renames > 0.015 : <unmeasured> ( 5.1%) > 0.280 : 41 : ....label:regular renames > 0.036 : <unmeasured> (12.7%) > 0.244 : 6 : ......label:diffcore_rename > 0.001 : <unmeasured> ( 0.4%) > 0.078 : 6 : ........label:dir rename setup > 0.055 : 6 : ........label:basename matches > 0.051 : 6 : ........label:exact renames > 0.031 : 6 : ........label:write back to queue > 0.017 : 6 : ........label:setup > 0.009 : 6 : ........label:cull basename > 0.003 : 6 : ........label:cull exact > 0.002 : 35 : ....label:directory renames > 0.001 : 35 : ....label:process renames > 0.052 : 35 : ..label:process_entries > 0.001 : <unmeasured> ( 1.7%) > 0.033 : 35 : ....label:processing > 0.017 : 35 : ....label:process_entries setup > 0.001 : <unmeasured> ( 5.8%) > 0.008 : 35 : ......label:plist copy > 0.008 : 35 : ......label:plist sort > 0.000 : 35 : ......label:plist grow > 0.001 : 35 : ....label:finalize > 0.005 : 35 : ..label:merge_start > 0.001 : <unmeasured> (18.8%) > 0.004 : 34 : ....label:reset_maps > 0.000 : 35 : ....label:sanity checks > 0.000 : 1 : ....label:allocate/init > 0.003 : 6 : ..label:reset_maps > 0.035 : 1 : label:do_write_index > /home/newren/floss/linux-stable/.git/index.lock > 0.034 : 1 : label:checkout > 0.034 : <unmeasured> (99.9%) > 0.000 : 1 : ..label:Filtering content > 0.009 : 1 : label:do_read_index .git/index > 0.000 : 1 : label:write_auto_merge > 0.000 : 1 : label:record_unmerged > Estimated measurement overhead (.010 ms/region-measure * 679): > 0.006790000000000001 > Timing including forking: 0.960 (0.013 additional seconds) > === END OUTPUT === > This was a run that took just under 1s (and was a hot-cache case; I > had just done the same rebase before to warm the caches), and the > combination of index/working tree bits (everything at and after > do_write_index in the output) was 0.035+0.034+0.009+0+0=0.078 seconds, > corresponding to just over 8.1% of overall time. I'm curious where > that lands for your repository testcase; if the larger time ends up > somewhere under the indented label:incore_nonrecursive region, then > it's due to something other than index reading/updating/writing. > > *2*. If it really is due to index reading/updating/writing, then index > handling in merge-ort is confined to two functions: checkout() and > record_unmerged_index_entries(). Both functions aren't too long, and > neither one calls into any other function within merge-ort.c. > (Further, checkout() is a near copy of code from merge_working_tree() > in builtin/checkout.c, or at least a copy of that function from a year > or so ago.) As such, it's possible you can go in and make whatever > special tweaks you have for partial index reading/writing to those > functions. > > I'm curious to hear back more on this. I don't have a lot of time to dig into this right now, but here are the stats for my rebases and merges with and without your option. The first thing I notice for each is that there is a significant amount of "unmeasured" time at the beginning of each, and that could possibly be improved separately. First, try a rebase forward and backward. $ /_git/git/summarize-perf git rebase --onto to from test Successfully rebased and updated refs/heads/test. Accumulated times: 8.511 : <unmeasured> (74.9%) 1.331 : 1 : ......label:unpack_trees 0.200 : <unmeasured> (15.1%) 0.580 : 1 : ........label:traverse_trees 0.403 : 1 : ........label:clear_ce_flags/0x00000000_0x02000000 0.126 : 1 : ........label:check_updates 0.126 : <unmeasured> (100.0%) 0.000 : 1 : ..........label:Filtering content 0.021 : 1 : ........label:clear_ce_flags/0x00080000_0x42000000 0.000 : 1 : ........label:fully_valid 1.059 : 1 : ......label:do_write_index /_git/office/src/.git/index.lock 0.930 : <unmeasured> (87.9%) 0.128 : 1 : ........label:write/extension/cache_tree 0.455 : 2 : ......label:fully_valid 0.001 : 1 : ......label:traverse_trees 0.000 : 1 : ......label:check_updates Estimated measurement overhead (.010 ms/region-measure * 41): 0.00041000000000000005 Timing including forking: 11.382 (0.026 additional seconds) $ /_git/git/summarize-perf git rebase --onto from to test Successfully rebased and updated refs/heads/test. Accumulated times: 8.556 : <unmeasured> (75.2%) 1.315 : 1 : ......label:unpack_trees 0.197 : <unmeasured> (15.0%) 0.580 : 1 : ........label:traverse_trees 0.391 : 1 : ........label:clear_ce_flags/0x00000000_0x02000000 0.126 : 1 : ........label:check_updates 0.126 : <unmeasured> (100.0%) 0.000 : 1 : ..........label:Filtering content 0.021 : 1 : ........label:clear_ce_flags/0x00080000_0x42000000 0.000 : 1 : ........label:fully_valid 1.071 : 1 : ......label:do_write_index /_git/office/src/.git/index.lock 0.942 : <unmeasured> (88.0%) 0.129 : 1 : ........label:write/extension/cache_tree 0.431 : 2 : ......label:fully_valid 0.001 : 1 : ......label:traverse_trees 0.000 : 1 : ......label:check_updates Estimated measurement overhead (.010 ms/region-measure * 41): 0.00041000000000000005 Timing including forking: 11.399 (0.026 additional seconds) Then do the same with the ort strategy. $ /_git/git/summarize-perf git -c pull.twohead=ort rebase --onto to from test Successfully rebased and updated refs/heads/test. Accumulated times: 8.350 : <unmeasured> (73.2%) 1.403 : 1 : ....label:checkout 0.000 : <unmeasured> ( 0.0%) 1.403 : 1 : ......label:unpack_trees 0.312 : <unmeasured> (22.3%) 0.539 : 1 : ........label:traverse_trees 0.401 : 1 : ........label:clear_ce_flags/0x00000000_0x02000000 0.128 : 1 : ........label:check_updates 0.128 : <unmeasured> (100.0%) 0.000 : 1 : ..........label:Filtering content 0.021 : 1 : ........label:clear_ce_flags/0x00080000_0x42000000 0.000 : 1 : ........label:fully_valid 1.081 : 1 : ....label:do_write_index /_git/office/src/.git/index.lock 0.951 : <unmeasured> (88.1%) 0.129 : 1 : ......label:write/extension/cache_tree 0.432 : 2 : ....label:fully_valid 0.143 : 1 : ....label:do_read_index .git/index 0.019 : <unmeasured> (13.1%) 0.125 : 1 : label:read/extension/cache_tree 0.004 : 1 : ....label:incore_nonrecursive 0.001 : <unmeasured> (25.8%) 0.002 : 1 : ......label:process_entries 0.000 : <unmeasured> ( 2.6%) 0.001 : 1 : ........label:finalize 0.001 : 1 : ........label:process_entries setup 0.000 : <unmeasured> ( 6.7%) 0.001 : 1 : ..........label:plist sort 0.000 : 1 : ..........label:plist copy 0.000 : 1 : ..........label:plist grow 0.000 : 1 : ........label:processing 0.001 : 1 : ......label:collect_merge_info 0.000 : <unmeasured> (35.3%) 0.001 : 1 : ........label:traverse_trees 0.000 : 1 : ......label:merge_start 0.000 : <unmeasured> (42.3%) 0.000 : 1 : ........label:allocate/init 0.000 : 1 : ........label:sanity checks 0.000 : 1 : ......label:renames 0.001 : 1 : ....label:traverse_trees 0.000 : 1 : ....label:write_auto_merge 0.000 : 1 : ....label:check_updates 0.000 : 1 : ....label:record_unmerged Estimated measurement overhead (.010 ms/region-measure * 56): 0.0005600000000000001 Timing including forking: 11.442 (0.027 additional seconds) $ /_git/git/summarize-perf git -c pull.twohead=ort rebase --onto from to test Successfully rebased and updated refs/heads/test. Accumulated times: 8.337 : <unmeasured> (73.2%) 1.395 : 1 : ....label:checkout 0.000 : <unmeasured> ( 0.0%) 1.395 : 1 : ......label:unpack_trees 0.309 : <unmeasured> (22.1%) 0.537 : 1 : ........label:traverse_trees 0.403 : 1 : ........label:clear_ce_flags/0x00000000_0x02000000 0.124 : 1 : ........label:check_updates 0.124 : <unmeasured> (100.0%) 0.000 : 1 : ..........label:Filtering content 0.021 : 1 : ........label:clear_ce_flags/0x00080000_0x42000000 0.000 : 1 : ........label:fully_valid 1.084 : 1 : ....label:do_write_index /_git/office/src/.git/index.lock 0.955 : <unmeasured> (88.1%) 0.129 : 1 : ......label:write/extension/cache_tree 0.436 : 2 : ....label:fully_valid 0.137 : 1 : ....label:do_read_index .git/index 0.013 : <unmeasured> ( 9.3%) 0.125 : 1 : label:read/extension/cache_tree 0.004 : 1 : ....label:incore_nonrecursive 0.001 : <unmeasured> (24.5%) 0.002 : 1 : ......label:process_entries 0.000 : <unmeasured> ( 2.5%) 0.001 : 1 : ........label:finalize 0.001 : 1 : ........label:process_entries setup 0.000 : <unmeasured> ( 6.5%) 0.001 : 1 : ..........label:plist sort 0.000 : 1 : ..........label:plist copy 0.000 : 1 : ..........label:plist grow 0.000 : 1 : ........label:processing 0.001 : 1 : ......label:collect_merge_info 0.000 : <unmeasured> (26.5%) 0.001 : 1 : ........label:traverse_trees 0.000 : 1 : ......label:merge_start 0.000 : <unmeasured> (43.1%) 0.000 : 1 : ........label:allocate/init 0.000 : 1 : ........label:sanity checks 0.000 : 1 : ......label:renames 0.001 : 1 : ....label:traverse_trees 0.000 : 1 : ....label:write_auto_merge 0.000 : 1 : ....label:check_updates 0.000 : 1 : ....label:record_unmerged Estimated measurement overhead (.010 ms/region-measure * 56): 0.0005600000000000001 Timing including forking: 11.418 (0.024 additional seconds) And here are timings for a simple merge. Two files at root were changed in the commits I made, but there are also some larger changes from the commit history. These should all be seen as "this tree updated in one of the two, so take that tree". $ git reset --hard test2 && /_git/git/summarize-perf git merge test -m test Merge made by the 'recursive' strategy. Accumulated times: 2.647 : <unmeasured> (48.6%) 1.384 : 1 : ..label:unpack_trees 0.267 : <unmeasured> (19.3%) 0.582 : 1 : ....label:traverse_trees 0.391 : 1 : ....label:clear_ce_flags/0x00000000_0x02000000 0.124 : 1 : ....label:check_updates 0.124 : <unmeasured> (100.0%) 0.000 : 1 : ......label:Filtering content 0.021 : 1 : ....label:clear_ce_flags/0x00080000_0x42000000 0.000 : 1 : ....label:fully_valid 1.060 : 1 : ..label:do_write_index /_git/office/src/.git/index.lock 0.931 : <unmeasured> (87.9%) 0.128 : 1 : ....label:write/extension/cache_tree 0.226 : 1 : ..label:fully_valid 0.134 : 1 : ..label:do_read_index .git/index 0.008 : <unmeasured> ( 5.8%) 0.126 : 1 : label:read/extension/cache_tree 0.001 : 1 : ..label:traverse_trees 0.000 : 1 : ..label:check_updates 0.000 : 1 : ..label:setup 0.000 : 1 : ..label:write back to queue Estimated measurement overhead (.010 ms/region-measure * 20): 0.0002 Timing including forking: 5.466 (0.015 additional seconds) $ git reset --hard test2 && /_git/git/summarize-perf git -c pull.twohead=ort merge test -m test Merge made by the 'ort' strategy. Accumulated times: 2.531 : <unmeasured> (49.1%) 1.328 : 1 : ..label:checkout 0.000 : <unmeasured> ( 0.0%) 1.328 : 1 : ....label:unpack_trees 0.228 : <unmeasured> (17.2%) 0.566 : 1 : ......label:traverse_trees 0.388 : 1 : ......label:clear_ce_flags/0x00000000_0x02000000 0.125 : 1 : ......label:check_updates 0.125 : <unmeasured> (100.0%) 0.000 : 1 : ........label:Filtering content 0.021 : 1 : ......label:clear_ce_flags/0x00080000_0x42000000 0.000 : 1 : ......label:fully_valid 1.067 : 1 : ..label:do_write_index /_git/office/src/.git/index.lock 0.938 : <unmeasured> (87.9%) 0.129 : 1 : ....label:write/extension/cache_tree 0.230 : 1 : ..label:fully_valid 0.002 : 1 : ..label:incore_recursive 0.001 : <unmeasured> (22.3%) 0.001 : 1 : ....label:collect_merge_info 0.001 : <unmeasured> (60.2%) 0.000 : 1 : ......label:traverse_trees 0.001 : 1 : ....label:process_entries 0.000 : <unmeasured> ( 2.8%) 0.001 : 1 : ......label:finalize 0.000 : 1 : ......label:process_entries setup 0.000 : <unmeasured> ( 6.9%) 0.000 : 1 : ........label:plist sort 0.000 : 1 : ........label:plist copy 0.000 : 1 : ........label:plist grow 0.000 : 1 : ......label:processing 0.000 : 1 : ....label:merge_start 0.000 : <unmeasured> (50.0%) 0.000 : 1 : ......label:allocate/init 0.000 : 1 : ......label:sanity checks 0.000 : 1 : ....label:renames 0.001 : 1 : ..label:traverse_trees 0.000 : 1 : ..label:write_auto_merge 0.000 : 1 : ..label:check_updates 0.000 : 1 : ..label:setup 0.000 : 1 : ..label:display messages 0.000 : 1 : ..label:write back to queue 0.000 : 1 : ..label:record_unmerged Estimated measurement overhead (.010 ms/region-measure * 36): 0.00036 Timing including forking: 5.174 (0.015 additional seconds) Thanks, -Stolee