On Thu, 2016-01-14 at 17:53 +0000, One Thousand Gnomes wrote:
> > These results for Toeplitz are not plausible. Given random input you
> > cannot expect any hash function to produce such uniform results. I
> > suspect either your input data is biased or how your applying the hash
> > is.
> >
> > When I run 64 random IPv4 3-tuples through Toeplitz and Jenkins I get
> > something more reasonable:
>
> IPv4 address patterns are not random. Nothing like it. A long long time
> ago we did do a bunch of tuning for network hashes using big porn site
> data sets. Random it was not.
>

I ran my tests with non-random IPv4 addresses, as I had 2 hosts:
one server, one client (typical benchmark stuff).

The only 'random' part was the ports, so maybe ~20 bits of entropy,
considering how we allocate ports during connect() to a given
destination to avoid port reuse.

> It's probably hard to repeat that exercise now with geo specific routing,
> and all the front end caches and redirectors on big sites but I'd
> strongly suggest random input is not a good test, and also that you need
> to worry more about hash attacks than perfect distributions.

Anyway, the exercise is not to find a hash that splits 128 flows
into 16 buckets with exactly the same number of flows per bucket.

Maybe only 4 flows are sending at 3 Gbit/s, and the others are sending
at 100 kbit/s. There is no way the driver can predict the future.

This is why we prefer to select a queue based on the CPU sending the
packet. This permits a natural shift based on actual load, and is the
default on Linux (see XPS in Documentation/networking/scaling.txt).

Only this driver has a selection based on a flow 'hash'.
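
To put rough numbers on the 128 flows / 16 buckets point, here is a
minimal Python sketch. It is an illustration only: the modulo mapping
stands in for an ideal hash, and the choice of which four flows are the
heavy ones is an assumption made to show an unlucky case.

```python
NUM_QUEUES = 16
NUM_FLOWS = 128
HEAVY_BPS = 3_000_000_000  # 3 Gbit/s, as in the example above
LIGHT_BPS = 100_000        # 100 kbit/s


def queue_stats(heavy_flows):
    """Return (flows per queue, bits/s per queue) for an 'ideal' hash."""
    counts = [0] * NUM_QUEUES
    load = [0] * NUM_QUEUES
    for flow in range(NUM_FLOWS):
        # Hypothetical stand-in for a hash that spreads flows perfectly:
        # every queue ends up with exactly 8 flows.
        q = flow % NUM_QUEUES
        counts[q] += 1
        load[q] += HEAVY_BPS if flow in heavy_flows else LIGHT_BPS
    return counts, load


# Assumption: the 4 heavy flows happen to land in the same queue.
counts, load = queue_stats(heavy_flows={0, 16, 32, 48})

print("flows per queue:", counts)
print("busiest queue (bit/s):", max(load))
print("quietest queue (bit/s):", min(load))
```

Every queue holds the same number of flows, yet one queue carries
about 12 Gbit/s while the rest carry well under 1 Mbit/s each. A hash
over the flow tuple cannot see per-flow rates, so a perfectly even
flow count does not say much about load.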