On 06/07/2024 08:06, Junio C Hamano wrote:
Everybody uses diff.c::diff_algorithm somehow, will be affected by the diff.algorithm configuration variable, and everybody should honor "--diff-algorithm=<choice>" command line option to override?
I have looked into writing a patch to implement this, but it looks like the situation is quite messy. There are already half a dozen of commands which claim to honour the diff.algorithm configuration variable, but ignore it entirely. In the documentation of the recursive merge strategy (for instance in "man git-merge"), it is claimed that "recursive defaults to the diff.algorithm config setting". As far as I can tell, both from my reading of the code and my interactive testing, this is wrong. This affects the "merge", "rebase" and "pull" commands, which all three mention this configuration variable in their man page without respecting it. Ouch! I have looked for all commands which mention diff.algorithm in their man page and checked whether they indeed respect it. The "diff-index", "diff-tree" and "diff-files" commands also make this erroneous claim. The --diff-algorithm CLI option (as well as the --histogram and siblings) are respected, but not the diff.algorithm config variable. Those inconsistencies seem to be caused by the inclusion of the `diff-options.txt` file in the man pages, which leads their man pages to documenting a bunch of config variables which are in fact ignored. From a user perspective, I would say that those commands should indeed honour diff.algorithm, so I would be tempted to fix the code rather than the documentation. That being said, the man pages of those commands also mention other options which are in fact ignored, such as "diff.relative". I think it's likely that for some of those variables, the contrary is desirable: remove them from the docs because they indeed shouldn't be relied on by the command (because it does not make sense for that particular command). I don't have (yet) a good enough understanding of those commands and those options to judge this immediately, but it looks like there is some cleaning up to do. In diff.c, the code that is responsible for reading diff.* configuration variables for commands like "log", "diff" or "diff-index" divides diff.* variables into two categories: the ones that are "basic" (parsed by "git_diff_basic_config") and the ones only relevant to the "ui" (parsed by the "git_diff_ui_config" function). Surprisingly to me, the "diff.algorithm" variable belongs to the "ui" category, and is therefore skipped by commands such as "diff-index". It seems natural to me to move diff.algorithm to the "basic" section. However, that alone will not fix the problem (in my opinion much more serious) that the recursive merge strategy ignores the variable. To fix this, the parsing of this variable could either be added to "merge_recursive_config" (in merge-recursive.c), or in fact directly to "git_xmerge_config" (in xdiff-interface.c), which would then not only fix the recursive merge strategy, but also a range of other commands such as "rerere" or "merge-file". Given that I started looking into this with a specific interest in "merge-file", I would obviously be tempted to fix this bug directly at the xmerge level, and I think it would indeed make sense for other affected commands (surely "rerere" should benefit from using merge options that are consistent with the ones used for "git merge" or "git rebase", no?), but I can imagine that it's too bold a step and could have unwanted consequences I am not aware of - especially since Junio recommended to add support for diff.algorithm at the diff.c level. What do you think? In any case, I would of course make sure the "ort" strategy continues to ignore diff.algorithm for now, given its current default value. If you want to try this out for yourself, I have set up a test repository at https://git.kanthaus.online/antonin/testrepo. It has two branches "master" and "theirs", which you can try to merge/rebase, for instance "git merge -s recursive theirs" while being on master and having "diff.algorithm" set to "histogram". The merge scenario is designed to fail with a conflict if the myers algorithm is used, and succeed if histogram is used. You should be able to see that "git merge -s recursive theirs" will fail but "git merge -s recursive -Xdiff-algorithm=histogram theirs" should work. Best, Antonin