On Fri, Feb 10, 2023 at 5:59 PM Jeff King <peff@xxxxxxxx> wrote:
>
> On Thu, Feb 09, 2023 at 02:44:15PM +0000, Phillip Wood wrote:
>
> > To see the differences between the output of patience and histogram
> > algorithms I diffed the output of "git log -p --no-merges
> > --diff-algorithm=patience" and "git log -p --no-merges
> > --diff-algorithm=histogram". The first three differences are
> >
> > - 6c065f72b8 (http: support CURLOPT_PROTOCOLS_STR, 2023-01-16)
> >   In get_curl_allowed_protocols() the patience algorithm shows the
> >   change in the return statement more clearly
> >
> > - 47cfc9bd7d (attr: add flag `--source` to work with tree-ish, 2023-01-14)
> >   The histogram algorithm shows read_attr_from_index() being moved
> >   whereas the patience algorithm does not making the diff easier to
> >   follow.
> >
> > - b0226007f0 (fsmonitor: eliminate call to deprecated FSEventStream
> >   function, 2022-12-14)
> >   In fsm_listen__stop_async() the histogram algorithm shows
> >   data->shutdown_style = SHUTDOWN_EVENT;
> >   being moved, which is not as clear as the patience output which
> >   shows it as a context line.
>
> Just a small counter-point, since I happened to be looking at myers vs
> patience for something elsewhere in the thread, but:
>
>   git show 35bd13fcd2caa4185bf3729655ca20b6a5fe9b6f builtin/add.c

"fatal: bad object 35bd13fcd2caa4185bf3729655ca20b6a5fe9b6f"

Is that a local commit of yours?

> looks slightly better to me with myers, even though it is 2 lines
> longer. The issue is that patience and histogram are very eager to use
> blank lines as anchor points, so a diff like:
>
>   -some words
>   -
>   -and some more
>   +unrelated content
>   +
>   +but it happens to also be two paragraphs
>
> in myers becomes:
>
>   -some words
>   +unrelated content
>
>   -and some more
>   +but it happens to also be two paragraphs
>
> in patience (here I'm using single lines, but in practice these may be
> paragraphs, or stanzas of code).
> I think that's also the _strength_ of
> patience in many cases, but it really depends on the content. Replacing
> a multi-stanza block with another one may be the best explanation for
> what happened. Or the two stanzas may be independent, and showing the
> change for each one may be better.
>
> I'm not sure which one happens more often. And you'd probably want to
> weight it by how good/bad the change is. In the example I showed I don't
> find patience very much worse, since it's already a pretty ugly diff.
> But in cases where patience shines, it may be making things
> significantly more readable.
>
> I don't have a super strong opinion, but I just wanted to chime in that
> it is not clear to me that patience/histogram is always a win over myers
> (yes, I know your examples were comparing patience vs histogram, but the
> larger thread is discussing the other).

Oh, I agree histogram is not always a win over myers. I just feel it
is a win the majority of the time. But if you want more than "feels",
here's some solid data to back that up...

I found a study on the subject over at
https://link.springer.com/article/10.1007/s10664-019-09772-z. They
were particularly interested in whether other academic studies could
have been affected by git's different diff algorithms, and came away
with the answer that they could have been. They looked at a few
hundred thousand commits across two dozen different repositories and
found (note that they only looked at myers and histogram, ignoring
patience and minimal):

  * 92.4% - 98.6% of the diffs (depending on repo) are identical
    whether you use myers or histogram
  * 93.8% - 99.2% of the diffs (depending on repo) have the same
    number of added/deleted lines with myers and histogram
  * Of the >20k diffs that were not identical, they selected a
    random sample of 377 diffs (taking care to make sure they were
    statistically representative)
  * They divided the 377 diffs into "code" and "non-code" diffs, i.e.
    those modifying source code and those modifying other textual files
  * They had two people annotating the diffs and independently scoring
    them, and then checked for agreement between their answers
    afterwards. (No, they didn't always agree, but they did have
    substantial agreement.)

For the (again, non-identical) diffs modifying non-code, they found
(see table 11) that:
  * in 14.9% of the diffs, myers is better
  * in 13.4% of the diffs, histogram is better
  * in 71.6% of the diffs, the two are of equal quality

For the (non-identical) diffs modifying code, they found (again, see
table 11) that:
  * in 16.9% of the diffs, myers is better
  * in 62.6% of the diffs, histogram is better
  * in 20.6% of the diffs, the two are of equal quality

A ratio of nearly 4 to 1 (62.6% vs. 16.9%) for histogram being better
on code diffs is pretty weighty to me.

It's possible these results would have been even better were it not
for a couple of bugs in the histogram code (ported from the original
in jgit). Phillip pointed me to a problematic testcase that Stefan
Beller found, and in attempting to fix it (I'm on fix #4 or so), I
believe I found another issue. However, I don't want to go into too
much detail yet, as I found problems with some of my previous fixes
and already invalidated things I told Phillip just last week.
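
As an aside on the blank-line anchoring in the quoted two-paragraph
example above: here is a minimal Python sketch of the anchor-selection
step of patience diff as it is commonly described (lines unique to
both sides, then the longest order-preserving run of them). It is not
git's xdiff code and it does not model histogram; the function name
patience_anchors() is made up for illustration. It only shows why the
lone blank line ends up as the sole anchor in that example.

```python
from bisect import bisect_left
from collections import Counter


def patience_anchors(a, b):
    """Return (i, j) index pairs of lines that occur exactly once in
    both a and b, restricted to the longest run that keeps the same
    order on both sides. Hypothetical helper, not git's implementation."""
    count_a, count_b = Counter(a), Counter(b)
    # Position in b of every line that is unique within b.
    pos_b = {line: j for j, line in enumerate(b) if count_b[line] == 1}
    # Candidate matches, already ordered by position in a.
    pairs = [(i, pos_b[line]) for i, line in enumerate(a)
             if count_a[line] == 1 and line in pos_b]

    # Longest increasing subsequence over the b positions.
    tails = []                  # tails[k]: index into pairs ending a run of length k+1
    prev = [None] * len(pairs)  # back-pointers used to rebuild the run
    for idx, (_, j) in enumerate(pairs):
        k = bisect_left([pairs[t][1] for t in tails], j)
        prev[idx] = tails[k - 1] if k > 0 else None
        if k == len(tails):
            tails.append(idx)
        else:
            tails[k] = idx

    anchors = []
    idx = tails[-1] if tails else None
    while idx is not None:
        anchors.append(pairs[idx])
        idx = prev[idx]
    return anchors[::-1]


old = ["some words", "", "and some more"]
new = ["unrelated content", "", "but it happens to also be two paragraphs"]
print(patience_anchors(old, new))  # [(1, 1)]
```

The only line shared by both sides is the empty one, so it becomes the
anchor and the text on either side of it is diffed separately, which
gives the split-up shape of the second quoted diff. The myers result
quoted above comes from a different search and is not modelled here.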