On 4/20/2018 1:34 PM, Elijah Newren wrote:
On Fri, Apr 20, 2018 at 6:36 AM, Ben Peart <Ben.Peart@xxxxxxxxxxxxx> wrote:
This enables the user to set a couple of additional options for merge.
1. merge.aggressive - this is to try to resolve a few more trivial
merge cases. It is documented in read-tree and is not something you
can pass into merge itself.
2. merge.renames - this is to save git from having to go through all
3 trees to see if any renames happened.
For the work item repro that I have been using, this drops the merge time
from ~1 hour to ~5 minutes and the number of unmerged entries goes down from
~40,000 to 1.
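For readers following along, here is a minimal sketch of how the two settings described above could be switched on for a single repository. It assumes the key names exactly as given in the patch description (merge.aggressive and merge.renames); the temporary repository and the `repo` variable are only for illustration. A git build without this patch will still store merge.aggressive, but may simply ignore it.

```shell
# Sketch only: use a throwaway repository so no real config is touched.
repo=$(mktemp -d)
git init -q "$repo"

# Per the patch description: turn off rename detection during merges.
git -C "$repo" config merge.renames false

# Per the patch description: resolve a few more trivial merge cases
# (the behavior documented for read-tree --aggressive).
# Assumption: this key only has an effect on a git that includes the patch.
git -C "$repo" config merge.aggressive true

# Show what was recorded in the repository's config.
git -C "$repo" config --get-regexp '^merge\.'
```

Setting these per repository rather than with --global keeps the change limited to the one repo where the merge cost is a problem.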
Ooh, this is *very* interesting. Is there any chance I could also get
you to test performing the same merge with the version of git at
https://github.com/newren/git/tree/big-repo-small-cherry-pick and
report on your timings?
Unfortunately, it isn't quite that simple. My repo is _really_ big
(3.2M files and ~100K commits per week) and requires me to use a custom
fork of git that works with our GVFS solution for it to work at all.
I've been watching your work in this area and am hoping it pays off for
us if/when we have users that want to do rename detection and override
our defaults.
The 'big-repo-small-cherry-pick' name could be improved, but that
branch has a number of performance fixes for really poor rename
detection performance during merges. From your description, I'm
pretty sure it'll apply to your case. For my specific testcase, I
got a speedup factor of 30. Someone else on the list saw a factor of
24[1]. Results are highly dependent on the specific repo, but it's
certainly possible that it gets much of the factor of 12 speedup that
you saw with the new config settings you added.
However, what makes this case even more interesting to me is that my
branch may not be quite as effective as your workarounds. There are
other performance issues in merge that I am aware of, but for
which I haven't had the time to write the patches yet (I've been
waiting for the directory rename detection stuff to land and settle
down before working more on the performance aspects). I do not know
how big a factor those other performance issues are, but your
workarounds (namely the aggressive setting) may get around some of
those other issues as well, so I'm very interested to see how my
current branch compares to the speedups you got with these settings.
Thanks,
Elijah
[1] https://public-inbox.org/git/alpine.DEB.2.00.1711211303290.20686@xxxxxxxxxxxx/