Re: [PATCH 1/4] exfat: Simplify exfat_utf8_d_hash() for code points above U+FFFF

On Monday 06 April 2020 09:37:38 Kohada.Tetsuhiro@xxxxxxxxxxxxxxxxxxxxxxxxxxx wrote:
> > > If you want to get an unbiased hash value by specifying an 8 or 16-bit
> > > value,
> > 
> > Hello! In exfat we have a sequence of 21-bit values (not 8-bit, not 16-bit).
> 
> hash_32() generates a less-biased hash, even for 21-bit characters.
> 
> For a filename made of the following characters, the hash from partial_name_hash() behaves like this:
>  - 21-bit (surrogate pair): the upper 3 bits of the hash tend to be 0.
>  - 16-bit (mostly CJKV): the upper 8 bits of the hash tend to be 0.
>  - 8-bit (mostly Latin): the upper 16 bits of the hash tend to be 0.
> 
> I think the more frequently used Latin/CJKV characters matter more for
> hash efficiency than surrogate pair characters do.
> 
> The hash of partial_name_hash() for 8/16-bit characters is also biased.
> However, it works well.
> 
> Surrogate pair characters are used less frequently, and for them the hash of
> partial_name_hash() is less biased than it is for 8/16-bit characters.
> 
> So I think there is no problem with your patch.

So partial_name_hash(), as I used it in this patch series, is enough?
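
For reference, the bias figures quoted above can be checked with a rough
user-space sketch. It assumes the usual kernel definition of
partial_name_hash(), (prevhash + (c << 4) + (c >> 4)) * 11, narrowed here to
32 bits, and it looks only at a single character hashed from a zero initial
value. The names name_hash_step() and leading_zero_bits() are made up for
this illustration and are not kernel functions.

```c
#include <stdint.h>

/*
 * User-space copy of one partial_name_hash() step.
 * Assumption: 32-bit arithmetic, so results wrap modulo 2^32.
 */
static uint32_t name_hash_step(uint32_t c, uint32_t prevhash)
{
	return (prevhash + (c << 4) + (c >> 4)) * 11;
}

/* Hypothetical helper: number of zero bits above the highest set bit. */
static int leading_zero_bits(uint32_t v)
{
	int n = 0;
	uint32_t mask;

	for (mask = UINT32_C(1) << 31; mask != 0 && (v & mask) == 0; mask >>= 1)
		n++;
	return n;
}
```

With a zero previous hash, the largest 8-bit input (0xFF) leaves 16 leading
zero bits, the largest 16-bit input (0xFFFF) leaves 8, and the largest 21-bit
input (0x1FFFFF) leaves 3, which matches the list above. Longer names mix in
more characters, so this only bounds the first step.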


