Re: [PATCH] Documentation/diff-options: explain different diff algorithms

On Mon, Aug 6, 2018 at 4:18 PM Jonathan Nieder <jrnieder@xxxxxxxxx> wrote:

> > +DIFF ALGORITHMS
> > +---------------
>
> Please add some introductory words about what the headings refer to.

OK, will do.

>
> > +the shortest output.
>
> Trivia: the `minimal` variant of Myers doesn't guarantee shortest
> output, either: what it minimizes is the number of lines marked as
> added or removed.  If you want to minimize context lines too, then
> that would be a new variant. ;-)

... and take line length into account. ;-)

It minimizes the edit distance measured in lines, i.e. in a diff without
context lines we get the lowest possible number of added and removed lines.
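
To make "edit distance in terms of lines" concrete, here is a rough sketch
in Python. It is not how git computes it (Myers avoids building the full
table); it is a plain dynamic program over a longest common subsequence of
lines, and the function names are made up for illustration only.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two lists of lines."""
    # prev[j] is the LCS length of the lines of a seen so far and b[:j].
    prev = [0] * (len(b) + 1)
    for line_a in a:
        cur = [0]
        for j, line_b in enumerate(b, start=1):
            if line_a == line_b:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def min_changed_lines(a, b):
    """Fewest lines that must be marked '-' or '+' to turn a into b."""
    common = lcs_length(a, b)
    # Every line outside the common subsequence is either removed or added.
    return (len(a) - common) + (len(b) - common)


old = ["a", "b", "c", "d"]
new = ["a", "c", "d", "e"]
print(min_changed_lines(old, new))  # one line removed ("b"), one added ("e")
```

Context lines and line length play no role in this count, which is the
point of the trivia above.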

> > +This algorithm finds the longest common substring and recursively
> > +diffs the content before and after the longest common substring.
>
> optional: may be worth a short aside in the text about the distinction
> between LCS and LCS. ;-)
>
> It would be especially useful here, since the alphabet used in these
> strings is *lines* instead of characters, so the first-time reader
> could probably use some help in building their intuition.

That makes sense.
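
For the aside, something like the following sketch might help build the
intuition. It treats each file as a list of lines and contrasts the
longest common *substring* (a run of consecutive lines) with the longest
common *subsequence* (lines in order, gaps allowed). This is illustrative
Python with hypothetical helper names, not the code in xdiff.

```python
def longest_common_substring(a, b):
    """Longest run of consecutive lines that appears in both a and b."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i, line_a in enumerate(a, start=1):
        cur = [0] * (len(b) + 1)
        for j, line_b in enumerate(b, start=1):
            if line_a == line_b:
                # Extend the common run that ended at a[i-2], b[j-2].
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]


def longest_common_subsequence(a, b):
    """Longest list of lines appearing in both a and b in the same order."""
    # table[i][j] is the LCS length of a[i:] and b[j:].
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) - 1, -1, -1):
        for j in range(len(b) - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the start to recover one such subsequence.
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i, j = i + 1, j + 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


old = ["x", "a", "b", "y", "c"]
new = ["a", "b", "z", "c"]
print(longest_common_substring(old, new))    # consecutive lines only
print(longest_common_subsequence(old, new))  # gaps allowed
```

Here the substring is the two-line run "a", "b", while the subsequence
also picks up "c" across the gap.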

>
> > +This is often the fastest, but in corner cases (when there are
> > +many common substrings of the same length) it produces bad
>
> Can you clarify what "bad" means?  E.g. would "unexpected", or "poorly
> aligned", match what you mean?

I'll just go with "unexpected".

> > +results as seen in:
> > +
> > +     seq 1 100 >one
> > +     echo 99 > two
> > +     seq 1 2 98 >>two
> > +     git diff --no-index --histogram one two


