Re: Is --minimal ever not the right thing?

To add to what Mike said...

On Tue, Dec 19, 2023 at 9:25 AM Mike Castle <dalgoda@xxxxxxxxx> wrote:
>
> I believe that the diff algorithms available are the same ones in GNU
> diff.  From https://www.gnu.org/software/diffutils/manual/html_node/diff-Performance.html:
> """
> The way that GNU diff determines which lines have changed always comes
> up with a near-minimal set of differences. Usually it is good enough
> for practical purposes. If the diff output is large, you might want
> diff to use a modified algorithm that sometimes produces a smaller set
> of differences. The --minimal (-d) option does this; however, it can
> also cause diff to run more slowly than usual, so it is not the
> default behavior.
> """
>
> Since it has been that way decades before git even existed, I suspect
> (but do not know) that, yes, analysis has been performed, and it makes
> sense to keep the current default.
>
> Then again, in the decades since, the entire stack from hardware to
> compilers has improved, and maybe it does deserve a revisit.  You
> could check whatever email archive is used for diffutils and see if
> there has been any discussion on it recently (say, last 5 years?).
>
> As you pointed out, you can set it yourself and see what happens over time.

There have been various discussions of diff performance, quality of
results, what the default should be, and so on, including within the
last year.

minimal is guaranteed to produce a minimal diff, i.e. the fewest total
removed and added lines.  That is sometimes the "best" quality, but
definitely not always.  On the performance axis, minimal can be nearly
as fast as myers and the other diff algorithms, but only in special
cases.
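
To make "minimal" concrete: the number being minimized is removed lines
plus added lines, which works out to len(old) + len(new) - 2 * LCS,
where LCS is the length of the longest common subsequence of lines.
Here is a rough Python sketch of that count.  It uses a plain quadratic
dynamic-programming table, which is not how Git's diff code computes
it, and the function names are made up for illustration:

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two lists of lines."""
    # prev[j] holds the LCS length of the lines seen so far in a and b[:j].
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def minimal_diff_size(old, new):
    """Fewest removed plus added lines needed to turn old into new."""
    common = lcs_length(old, new)
    return (len(old) - common) + (len(new) - common)


old = ["a", "b", "c", "d"]
new = ["a", "c", "d", "e"]
print(minimal_diff_size(old, new))  # prints 2: remove "b", add "e"
```

This only gives the size of a minimal diff.  Several different diffs
can share that size, and the count says nothing about which of them is
easiest to read, which is part of why minimal is not always "best".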

I think patience or histogram would make better defaults, at least
with some tweaks.  In early 2023 I was working on some patches to
improve histogram's worst-case performance and the quality of its
results, but those got put on the back burner when $DAYJOB pulled
support for my Git work.  And I'm not aware of anyone else currently
working in the area.

Hope that helps,
Elijah




