Re: [PATCH 2/2] diff.c: get rid of duplicate implementation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 25.10.2017 um 20:49 schrieb Stefan Beller:
> The implementations in diff.c to detect moved lines needs to compare
> strings and hash strings, which is implemented in that file, as well
> as in the xdiff library.
> 
> Remove the rather recent implementation in diff.c and rely on the well
> exercised code in the xdiff lib.
> 
> With this change the hash used for bucketing the strings for the moved
> line detection changes from FNV32 (that is provided via the hashmaps
> memhash) to DJB2 (which is used internally in xdiff).  Benchmarks found
> on the web[1] do not indicate that these hashes are different in
> performance for readable strings.
> 
> [1] https://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed

Awesome comparison!  It calls the variant used in libxdiff DJB2a (which
uses xor to mix in the new char) instead of DJB2 (which uses addition).

There's also https://www.strchr.com/hash_functions, which lists DJB2
as Bernstein.  Both functions rank somewhere in the middle of that list.

> Signed-off-by: Stefan Beller <sbeller@xxxxxxxxxx>
> ---
>   diff.c | 82 ++++--------------------------------------------------------------
>   1 file changed, 4 insertions(+), 78 deletions(-)

Very nice.

René



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux