On 05/21/2018 08:10 PM, Derrick Stolee wrote:
> [...]
> In the Discussion section of the `git merge-base` docs [1], we have the
> following:
>
>     When the history involves criss-cross merges, there can be more than
>     one best common ancestor for two commits. For example, with this topology:
>
>     ---1---o---A
>         \ /
>          X
>         / \
>     ---2---o---o---B
>
>     both 1 and 2 are merge-bases of A and B. Neither one is better than
>     the other (both are best merge bases). When the --all option is not
>     given, it is unspecified which best one is output.
>
> This means our official documentation mentions that we do not have a
> concrete way to differentiate between these choices. This makes me think
> that this change in behavior is not a bug, but it _is_ a change in
> behavior. It's worth mentioning, but I don't think there is any value in
> making sure `git merge-base` returns the same output.
>
> Does anyone disagree? Is this something we should solidify so we always
> have a "definitive" merge-base?
> [...]

This may be beyond the scope of what you are working on, but there are
significant advantages to selecting a "best" merge base from among the
candidates.

Long ago [1] I proposed that the "best" merge base is the merge base
candidate that minimizes the number of non-merge commits that are in

    git rev-list $candidate..$branch

that are already in master:

    git rev-list $master

(assuming merging branch into master), which is equivalent to choosing
the merge base that minimizes

    git rev-list --count $candidate..$branch

In fact, this criterion is symmetric if you exchange branch ↔ master,
which is a nice property, and indeed generalizes pretty simply to
computing the merge base of more than two commits.

In that email I also included some data showing that the "best" merge
base almost always results in either the same or a shorter diff than the
more or less arbitrary algorithm that we currently use. Sometimes the
difference in diff length is dramatic.

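To make the criterion concrete, here is a minimal Python sketch over a
toy in-memory commit graph. It is only an illustration, not the script
mentioned in this message and not how git itself computes merge bases.
The graph `PARENTS` and the helpers `ancestors`, `merge_bases`,
`range_count` and `best_merge_base` are all hypothetical names. The
sketch counts every commit in candidate..branch (mirroring the
`git rev-list --count` form) and, as a simplifying assumption, breaks
ties by commit name only.

```python
# Toy criss-cross history: each commit maps to its list of parents.
# "1" sits on a longer line of history than "2" (names are made up).
PARENTS = {
    "R": [],
    "p1": ["R"], "p2": ["p1"], "1": ["p2"],
    "2": ["R"],
    "m_top": ["1", "2"], "A": ["m_top"],
    "m_bot": ["2", "1"], "o": ["m_bot"], "B": ["o"],
}


def ancestors(parents, commit):
    """Return every commit reachable from `commit`, including itself."""
    seen = set()
    stack = [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen


def merge_bases(parents, a, b):
    """Common ancestors of a and b that are not ancestors of another one."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {
        c for c in common
        if not any(c != d and c in ancestors(parents, d) for d in common)
    }


def range_count(parents, candidate, tip):
    """Toy analogue of `git rev-list --count candidate..tip`."""
    return len(ancestors(parents, tip) - ancestors(parents, candidate))


def best_merge_base(parents, branch, master):
    """Pick the candidate with the fewest commits in candidate..branch.

    Ties fall back to the commit name, an assumption standing in for
    the tie-breaks suggested in this message. Returns None when the
    two commits share no history.
    """
    candidates = merge_bases(parents, branch, master)
    return min(
        candidates,
        key=lambda c: (range_count(parents, c, branch), c),
        default=None,
    )


print(sorted(merge_bases(PARENTS, "A", "B")))  # ['1', '2']
print(best_merge_base(PARENTS, "B", "A"))      # 1
```

In this toy model every candidate is an ancestor of the branch tip, so
the count for a candidate is the size of the tip's ancestry minus the
size of the candidate's ancestry. Minimizing it therefore amounts to
picking the candidate with the most history behind it, which is why
swapping the two arguments gives the same answer here.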
To me it feels like the best *deterministic* merge base would be based
on the above criterion, maybe with first-parent reachability, commit
times, and SHA-1s used (in that order) to break ties.

I don't plan to work on the implementation of this idea myself (though
we've long used a script-based implementation of this algorithm
internally at GitHub).

Michael

[1] https://public-inbox.org/git/539A25BF.4060501@xxxxxxxxxxxx/
    See the rest of the thread for more interesting discussion.

[2] https://public-inbox.org/git/8a9b3f20-eed2-c59b-f7ea-3c68b3c30bf5@xxxxxxxxxxxx/
    Higher in this thread, Junio proposes a different criterion.