Re: [PATCH] merge-file: warn for implicit 'myers' algorithm

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 06/07/2024 08:06, Junio C Hamano wrote:
Everybody uses diff.c::diff_algorithm somehow, will be affected by
the diff.algorithm configuration variable, and everybody should
honor "--diff-algorithm=<choice>" command line option to override?

I have looked into writing a patch to implement this, but it looks like
the situation is quite messy. There are already half a dozen of commands
which claim to honour the diff.algorithm configuration variable, but
ignore it entirely.

In the documentation of the recursive merge strategy (for instance in
"man git-merge"), it is claimed that "recursive defaults to the
diff.algorithm config setting". As far as I can tell, both from my
reading of the code and my interactive testing, this is wrong. This
affects the "merge", "rebase" and "pull" commands, which all three
mention this configuration variable in their man page without respecting
it. Ouch!

I have looked for all commands which mention diff.algorithm in their man
page and checked whether they indeed respect it. The "diff-index",
"diff-tree" and "diff-files" commands also make this erroneous claim.
The --diff-algorithm CLI option (as well as the --histogram and
siblings) are respected, but not the diff.algorithm config variable.
Those inconsistencies seem to be caused by the inclusion of the
`diff-options.txt` file in the man pages, which leads their man pages to
documenting a bunch of config variables which are in fact ignored. From
a user perspective, I would say that those commands should indeed honour
diff.algorithm, so I would be tempted to fix the code rather than the
documentation. That being said, the man pages of those commands also
mention other options which are in fact ignored, such as
"diff.relative". I think it's likely that for some of those variables,
the contrary is desirable: remove them from the docs because they indeed
shouldn't be relied on by the command (because it does not make sense
for that particular command). I don't have (yet) a good enough
understanding of those commands and those options to judge this
immediately, but it looks like there is some cleaning up to do.

In diff.c, the code that is responsible for reading diff.* configuration
variables for commands like "log", "diff" or "diff-index" divides diff.*
variables into two categories: the ones that are "basic" (parsed by
"git_diff_basic_config") and the ones only relevant to the "ui" (parsed
by the "git_diff_ui_config" function). Surprisingly to me, the
"diff.algorithm" variable belongs to the "ui" category, and is therefore
skipped by commands such as "diff-index". It seems natural to me to move
diff.algorithm to the "basic" section. However, that alone will not fix
the problem (in my opinion much more serious) that the recursive merge
strategy ignores the variable. To fix this, the parsing of this variable
could either be added to "merge_recursive_config" (in
merge-recursive.c), or in fact directly to "git_xmerge_config" (in
xdiff-interface.c), which would then not only fix the recursive merge
strategy, but also a range of other commands such as "rerere" or
"merge-file". Given that I started looking into this with a specific
interest in "merge-file", I would obviously be tempted to fix this bug
directly at the xmerge level, and I think it would indeed make sense for
other affected commands (surely "rerere" should benefit from using merge
options that are consistent with the ones used for "git merge" or "git
rebase", no?), but I can imagine that it's too bold a step and could
have unwanted consequences I am not aware of - especially since Junio
recommended to add support for diff.algorithm at the diff.c level. What
do you think?

In any case, I would of course make sure the "ort" strategy continues to
ignore diff.algorithm for now, given its current default value.

If you want to try this out for yourself, I have set up a test
repository at https://git.kanthaus.online/antonin/testrepo. It has two
branches "master" and "theirs", which you can try to merge/rebase, for
instance "git merge -s recursive theirs" while being on master and
having "diff.algorithm" set to "histogram". The merge scenario is
designed to fail with a conflict if the myers algorithm is used, and
succeed if histogram is used. You should be able to see that "git merge
-s recursive theirs" will fail but "git merge -s recursive
-Xdiff-algorithm=histogram theirs" should work.

Best,

Antonin






[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux