Hi Phillip,

On Thu, Feb 9, 2023 at 6:44 AM Phillip Wood <phillip.wood123@xxxxxxxxx> wrote:
>
> Hi Elijah
>
> On 09/02/2023 09:09, Elijah Newren wrote:
> > Hi John and Phillip,
> >
> > On Tue, Feb 7, 2023 at 9:05 AM John Cai <johncai86@xxxxxxxxx> wrote:
> >>
> > [...]
> >>> Perhaps I'm over simplifying but having read the issue you linked to I couldn't help feeling that the majority of users might be satisfied by just changing gitlab to use the patience algorithm when generating diffs.
> >>
> >> Right, I recognize this is a judgment call that may be best left up to the list.
> >>
> >> We don't have a way in GitLab to change the diff algorithm currently. Of course
> >> that can be implemented outside of Git,
> >
> > Well, the below doesn't allow users to make diffs better for
> > *individual* files of interest, but if you agree with me that we
> > should just make diffs better for all users automatically, it's a
> > two-line change in git.git that I'd love to eventually convince the
> > project to take (though obviously doing that would also require some
> > documentation changes and some good messaging in release notes and
> > whatnot). I've used it for a good long while, and had a few dozen
> > users using this patch too, all without complaint:
>
> I'd support a change to either patience or histogram as the default
> algorithm. My personal preference would be for the patience algorithm as
> I think it generally gives nicer diffs in the cases that the two
> disagree (see below, I've tried changing diff.algorithm to histogram a
> few times and I always end up changing it back to patience pretty
> quickly). However I can see there is an advantage in having "diff" and
> "merge" use the same algorithm as users who diffing either side to the
> merge base will see the same diff that the merge is using. The histogram
> algorithm is known to produce sub-optimal diffs in certain cases[1] but
> I'm not sure how much worse it is in that respect than any of the other
> algorithms.

[...]

> [1]
> https://lore.kernel.org/git/CAGZ79kZYO6hHiAM8Sfp3J=VX11c=0-7YDSx3_EAKt5-uvvt-Ew@xxxxxxxxxxxxxx/

Thanks, I might have a fix, though I'm a bit worried my tweaks might
trigger issues elsewhere or cost a bit of performance; I'll need to
test. Are there any other good known testcases where histogram
produces sub-optimal diffs?

> To see the differences between the output of patience and histogram
> algorithms I diffed the output of "git log -p --no-merges
> --diff-algorithm=patience" and "git log -p --no-merges
> --diff-algorithm=histogram". The first three differences are
>
> - 6c065f72b8 (http: support CURLOPT_PROTOCOLS_STR, 2023-01-16)
>   In get_curl_allowed_protocols() the patience algorithm shows the
>   change in the return statement more clearly
>
> - 47cfc9bd7d (attr: add flag `--source` to work with tree-ish, 2023-01-14)
>   The histogram algorithm shows read_attr_from_index() being moved
>   whereas the patience algorithm does not making the diff easier to
>   follow.
>
> - b0226007f0 (fsmonitor: eliminate call to deprecated FSEventStream
>   function, 2022-12-14)
>   In fsm_listen__stop_async() the histogram algorithm shows
>       data->shutdown_style = SHUTDOWN_EVENT;
>   being moved, which is not as clear as the patience output which
>   shows it as a context line.

If my current changes are "good", then they also remove the
differences between patience and histogram for the second and third
commits above. (And the differences between the two algorithms for
the first commit look really minor.)

> I think there is a degree of personal preference when it comes to which
> out of patience or histogram is best and the user can easily select
> their preferred algorithm so I'd be happy with either.

:-)
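
For anyone who wants to repeat the comparison quoted above (running "git log -p --no-merges" once per algorithm and looking at where the output differs), here is a minimal Python sketch. The helper names (log_command, split_commits, differing_commits, first_differences) are illustrative and not part of git. It assumes the default "git log" format, in which every entry starts with an unindented "commit <hash>" line, and it reports full hashes rather than the abbreviated ones used in the list above.

```python
import subprocess


def log_command(algorithm):
    """Build the `git log` invocation quoted above for one diff algorithm."""
    return ["git", "log", "-p", "--no-merges", f"--diff-algorithm={algorithm}"]


def split_commits(log_text):
    """Map each commit hash to the full text of its log entry.

    Assumption: default `git log` format, where each entry begins with an
    unindented "commit <hash>" line (message lines are indented and patch
    lines carry a prefix, so neither starts with "commit ").
    """
    commits = {}
    current = None
    for line in log_text.splitlines():
        if line.startswith("commit "):
            current = line.split()[1]
            commits[current] = []
        if current is not None:
            commits[current].append(line)
    return {sha: "\n".join(lines) for sha, lines in commits.items()}


def differing_commits(log_a, log_b):
    """Return hashes, in log order, whose entries differ between two logs."""
    a, b = split_commits(log_a), split_commits(log_b)
    return [sha for sha in a if sha in b and a[sha] != b[sha]]


def first_differences(repo=".", limit=3):
    """Run both logs in `repo` and return the first `limit` differing commits.

    Needs git on PATH and a real repository; not exercised by pure-string use.
    """
    logs = []
    for algorithm in ("patience", "histogram"):
        result = subprocess.run(log_command(algorithm), cwd=repo, check=True,
                                capture_output=True, text=True, errors="replace")
        logs.append(result.stdout)
    return differing_commits(logs[0], logs[1])[:limit]
```

Calling first_differences() from inside a git.git checkout should list the most recent commits where the two algorithms disagree. Which commits show up depends on the checkout and the git version, so the three commits discussed above are not guaranteed to be the first ones reported.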