+ improve-randomness-of-hash_long-on-64bit.patch added to -mm tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



The patch titled

     Improve randomness of hash_long on 64bit

has been added to the -mm tree.  Its filename is

     improve-randomness-of-hash_long-on-64bit.patch

See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
out what to do about this


From: NeilBrown <neilb@xxxxxxx>

I'm still bothered by the poor showing on hash_long on 64bit on number that
differ only in the 3rd or 4th byte (as IP addresses might on a
little-endian host).  I hacked around the problem in
net/sunrpc/svcauth_unix.c until this issue got resolved, but it never did. 
So I am pushing again.

The problem is that the bit-spares prime that is close to the golden ratio
isn't really close enough.

I propose 'fixing' it by using hash_u32 on the upper and lower halves of a
64bit value.  This is slightly less efficient, but most code would probably
be happy calling hash_u32 anyway.

More comments in the code.

Cc: William Lee Irwin III <wli@xxxxxxxxxxxxxx>
Signed-off-by: Neil Brown <neilb@xxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxx>
---

 include/linux/hash.h |   69 +++++++++++++++++++++++------------------
 1 files changed, 40 insertions(+), 29 deletions(-)

diff -puN include/linux/hash.h~improve-randomness-of-hash_long-on-64bit include/linux/hash.h
--- devel/include/linux/hash.h~improve-randomness-of-hash_long-on-64bit	2006-05-11 00:30:20.000000000 -0700
+++ devel-akpm/include/linux/hash.h	2006-05-11 00:30:20.000000000 -0700
@@ -12,45 +12,56 @@
  * These primes are chosen to be bit-sparse, that is operations on
  * them can use shifts and additions instead of multiplications for
  * machines where multiplications are slow.
+ *
+ * Unfortunately, the closeness to the golden ratio appears to be
+ * more important than the primality:  The bit-sparse 64bit number
+ * provides particularly poor hashing for values that differ in the
+ * third or fourth bytes only (the 0xffff0000 bits), though the 32bit
+ * prime seems to work reasonably well.
+ *
+ * We never actually need anything close to 64bits from this hash function.
+ * The most any caller in the kernel asked for (as at 2.6.16) is 14 bits (pid_hash).
+ * Thus limiting the return to 32bits is safe.  So we hash 64bit
+ * quantities with a double 32bit hash.
+ *
+ * And in many cases, we only really want hash_u32, even on a 64bit arch.
  */
-#if BITS_PER_LONG == 32
 /* 2^31 + 2^29 - 2^25 + 2^22 - 2^19 - 2^16 + 1 */
-#define GOLDEN_RATIO_PRIME 0x9e370001UL
-#elif BITS_PER_LONG == 64
+#define GOLDEN_RATIO_PRIME_32 0x9e370001UL
 /*  2^63 + 2^61 - 2^57 + 2^54 - 2^51 - 2^18 + 1 */
-#define GOLDEN_RATIO_PRIME 0x9e37fffffffc0001UL
-#else
-#error Define GOLDEN_RATIO_PRIME for your wordsize.
-#endif
+#define GOLDEN_RATIO_PRIME_64 0x9e37fffffffc0001UL
 
-static inline unsigned long hash_long(unsigned long val, unsigned int bits)
+static inline u32 hash_u32(u32 val, unsigned int bits)
 {
-	unsigned long hash = val;
+	u32 hash = val;
 
-#if BITS_PER_LONG == 64
-	/*  Sigh, gcc can't optimise this alone like it does for 32 bits. */
-	unsigned long n = hash;
-	n <<= 18;
-	hash -= n;
-	n <<= 33;
-	hash -= n;
-	n <<= 3;
-	hash += n;
-	n <<= 3;
-	hash -= n;
-	n <<= 4;
-	hash += n;
-	n <<= 2;
-	hash += n;
-#else
 	/* On some cpus multiply is faster, on others gcc will do shifts */
-	hash *= GOLDEN_RATIO_PRIME;
-#endif
+	hash *= GOLDEN_RATIO_PRIME_32;
 
 	/* High bits are more random, so use them. */
-	return hash >> (BITS_PER_LONG - bits);
+	return hash >> (32 - bits);
+}
+
+static inline u32 hash_u64(u64 val, unsigned int bits)
+{
+	u32 hi = val >> 32;
+	return hash_u32( hash_u32( (u32)val , 32) ^ hi ,
+			 bits);
+}
+
+#if BITS_PER_LONG == 32
+static inline unsigned long hash_long(unsigned long val, unsigned int bits)
+{
+	return (unsigned long)hash_u32( (u32)val, bits);
+}
+#endif
+#if BITS_PER_LONG == 64
+static inline unsigned long hash_long(unsigned long val, unsigned int bits)
+{
+	return (unsigned long)hash_u64( (u64)val, bits);
 }
-	
+#endif
+
 static inline unsigned long hash_ptr(void *ptr, unsigned int bits)
 {
 	return hash_long((unsigned long)ptr, bits);
_

Patches currently in -mm which might be from neilb@xxxxxxx are

origin.patch
fix-dcache-race-during-umount.patch
fix-dcache-race-during-umount-fix.patch
prune_one_dentry-tweaks.patch
remove-softlockup-from-invalidate_mapping_pages.patch
improve-randomness-of-hash_long-on-64bit.patch
kconfig-select-things-at-the-closest-tristate-instead-of-bool.patch
make-address_space_operations-invalidatepage-return-void-reiser4.patch
md-reformat-code-in-raid1_end_write_request-to-avoid-goto.patch
md-remove-arbitrary-limit-on-chunk-size.patch
md-remove-useless-ioctl-warning.patch
md-increase-the-delay-before-marking-metadata-clean-and-make-it-configurable.patch
md-merge-raid5-and-raid6-code.patch
md-remove-nuisance-message-at-shutdown.patch
md-allow-checkpoint-of-recovery-with-version-1-superblock.patch
md-allow-a-linear-array-to-have-drives-added-while-active.patch
md-support-stripe-offset-mode-in-raid10.patch
md-make-md_print_devices-static.patch
md-split-reshape-portion-of-raid5-sync_request-into-a-separate-function.patch
md-dm-reduce-stack-usage-with-stacked-block-devices.patch

-
To unsubscribe from this list: send the line "unsubscribe mm-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Kernel Newbies FAQ]     [Kernel Archive]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [Bugtraq]     [Photo]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]

  Powered by Linux