Re: [PATCH] diffcore-delta: avoid ignoring final 'line' of file

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jan 11, 2024 at 3:00 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
>
> > diff --git a/diffcore-delta.c b/diffcore-delta.c
> > index c30b56e983b..7136c3dd203 100644
> > --- a/diffcore-delta.c
> > +++ b/diffcore-delta.c
> > @@ -159,6 +159,10 @@ static struct spanhash_top *hash_chars(struct repository *r,
> >               n = 0;
> >               accum1 = accum2 = 0;
> >       }
> > +     if (n > 0) {
> > +             hashval = (accum1 + accum2 * 0x61) % HASHBASE;
> > +             hash = add_spanhash(hash, hashval, n);
> > +     }
>
> OK, so we were ignoring the final short bit that is not terminated
> with LF due to the "continue".  Nicely found.
>
> > diff --git a/t/t4001-diff-rename.sh b/t/t4001-diff-rename.sh
> > index 85be1367de6..29299acbce7 100755
> > --- a/t/t4001-diff-rename.sh
> > +++ b/t/t4001-diff-rename.sh
> > @@ -286,4 +286,23 @@ test_expect_success 'basename similarity vs best similarity' '
> >       test_cmp expected actual
> >  '
> >
> > +test_expect_success 'last line matters too' '
> > +     test_write_lines a 0 1 2 3 4 5 6 7 8 9 >nonewline &&
> > +     printf "git ignores final up to 63 characters if not newline terminated" >>nonewline &&
> > +     git add nonewline &&
> > +     git commit -m "original version of file with no final newline" &&
>
> I found it misleading that the file whose name is nonewline has
> bunch of LF including on its last line ;-).

Yeah, sorry.  It's been a while, but I think I started with a file
with only one line that had no LF, but then realized for inexact
rename detection to kick in I needed some other lines, at least one of
which didn't match...but I forgot to update the filename after adding
the other lines...

> > +     # Change ONLY the first character of the whole file
> > +     test_write_lines b 0 1 2 3 4 5 6 7 8 9 >nonewline &&
>
> Same here, but it is too much to bother rewriting it ...
>
>         {
>                 test_write_lines ...
>                 printf ...
>         } >incomplete
>
> ... like so ("incomplete" stands for "file ending with an incomplete line"),
> so I'll let it pass.
>
> > +     printf "git ignores final up to 63 characters if not newline terminated" >>nonewline &&
>
>
> > +     git add nonewline &&
> > +     git mv nonewline still-no-newline &&
> > +     git commit -a -m "rename nonewline -> still-no-newline" &&
> > +     git diff-tree -r -M01 --name-status HEAD^ HEAD >actual &&
> > +     cat >expected <<-\EOF &&
> > +     R097    nonewline       still-no-newline
>
> I am not very happy with the hardcoded 97.  You are already using
> the non-standard 10% threshold.  If the delta detection that
> forgets about the last line is so broken as your proposed log
> message noted, shouldn't you be able to construct a sample pair of
> preimage and postimage for which the broken version gives so low
> similarity to be judged not worth treating as a rename, while the
> fixed version gives reasonable similarity to be made into a rename,
> by the default threshold?  That way, the test only needs to see if
> we got a rename (with any similarity) or a delete and an add.

Oops, the threshold is entirely unnecessary here; not sure why I
didn't remember to take it out (originally used the threshold while
testing without the fix to just how low of a similarity git thought
these nearly identical files had).

Since you don't like the threshold, and since we don't seem to have a
summary format that reports on the rename without the percentage, I
guess I need to munge the output with sed:

      sed -e "s/^R[0-9]* /R /" actual >actual.munged &&

and then compare the expected output to that.  I'll send in a patch
doing so and fix up the filenames and drop the rename threshold while
at it.





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux