Re: [PATCH v6 6/6] blame: use a fingerprint heuristic to match ignored lines


>  - I wonder if the hash used here can replace what is used in
>    diffcore-delta.c as an improvement (or obviously vice versa), as
>    using two (or more) ad-hoc fingerprinting function without having
>    a clear reason why we need two instead of a unified one feels
>    like a bad idea.

Hi Junio,
If I understand correctly, the algorithm in diffcore-delta.c is
intended to match files that contain identical lines (or 64-byte
chunks). The fingerprinting that Barret & I are talking about is
intended to match lines that contain identical byte pairs.
With significant refactoring, you could make the diffcore-delta
algorithm apply in both cases, but I think the end result would be
longer and more complicated than keeping the two separate.
Unlike hashing a line, hashing a byte pair is trivial. Also unlike
line hashing, the pairs overlap: every byte except the first and last
is included in two "hashes", so "hello" is hashed to "he", "el", "ll",
"lo".
So, based on my limited understanding of diffcore-delta.c, I think the
two algorithms are sufficiently different in intent and in
implementation that it's appropriate to keep them separate.
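
To make the byte-pair idea concrete, here is a rough sketch in Python
rather than C. It is not the code from the patch: the names byte_pairs
and similarity are made up for illustration, and scoring two lines by
the size of the multiset intersection of their pairs is my assumption
about one reasonable way to compare fingerprints.

```python
from collections import Counter


def byte_pairs(line: bytes) -> Counter:
    """Count every overlapping pair of adjacent bytes in a line.

    Hypothetical helper: b"hello" yields the pairs he, el, ll, lo.
    A line shorter than two bytes yields no pairs at all.
    """
    return Counter(line[i:i + 2] for i in range(len(line) - 1))


def similarity(a: bytes, b: bytes) -> int:
    """Number of byte pairs two lines have in common.

    Assumption: similarity is the size of the multiset intersection,
    so a pair that occurs twice in both lines counts twice.
    """
    return sum((byte_pairs(a) & byte_pairs(b)).values())


if __name__ == "__main__":
    print(sorted(byte_pairs(b"hello")))
    print(similarity(b"hello", b"help"))
```

Note that no real hash function is needed here: the pair of bytes is
its own key, which is what makes this so much simpler than hashing
whole lines or 64-byte chunks.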

Regarding the "old heuristic", I think there may still be a use case
for it, but I'll expand on that later.

Thanks,
-Michael


