> -----Original Message-----
> From: John Fastabend [mailto:john.fastabend@xxxxxxxxx]
> Sent: Thursday, January 7, 2016 5:02 PM
> To: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>; Simon Xiao
> <sixiao@xxxxxxxxxxxxx>; Eric Dumazet <eric.dumazet@xxxxxxxxx>
> Cc: Tom Herbert <tom@xxxxxxxxxxxxxxx>; netdev@xxxxxxxxxxxxxxx; KY
> Srinivasan <kys@xxxxxxxxxxxxx>; Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>;
> devel@xxxxxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; David Miller
> <davem@xxxxxxxxxxxxx>
> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct
> flow_keys layout
>
> On 16-01-07 05:28 AM, Vitaly Kuznetsov wrote:
> > Eric Dumazet <eric.dumazet@xxxxxxxxx> writes:
> >
> >> On Thu, 2016-01-07 at 10:33 +0100, Vitaly Kuznetsov wrote:
> >>> Recent changes to 'struct flow_keys' (e.g. commit d34af823ff40 ("net: Add
> >>> VLAN ID to flow_keys")) introduced a performance regression in netvsc
> >>> driver. The problem is, however, not the above mentioned commit but the
> >>> fact that netvsc_set_hash() function made some assumptions on the struct
> >>> flow_keys data layout and this is wrong. We need to extract the data we
> >>> need (src/dst addresses and ports) after the dissect.
> >>>
> >>> The issue could also be solved in a completely different way: as suggested
> >>> by Eric instead of our own homegrown netvsc_set_hash() we could use
> >>> skb_get_hash() which does more or less the same. Unfortunately, the
> >>> testing done by Simon showed that Hyper-V hosts are not happy with our
> >>> Jenkins hash, selecting the output queue with the current algorithm based
> >>> on Toeplitz hash works significantly better.
> >>
>
> Also can I ask the maybe naive question. It looks like the hypervisor
> is populating some table via a mailbox msg and this is used to select
> the queues I guess with some sort of weighting function?
>
> What happens if you just remove select_queue altogether? Or maybe just
> what is this 16 entry table doing?
> How does this work on my larger
> systems with 64+ cores can I only use 16 cores? Sorry I really have
> no experience with hyperV and this got me curious.

We will limit the number of VRSS channels to the number of CPUs in a NUMA
node. If the number of CPUs in a NUMA node exceeds 8, we will only open up
8 VRSS channels. On the host side, traffic spreading is currently done in
software, and we have found that limiting it to 8 CPUs gives us the best
throughput. In Windows Server 2016, we will be distributing traffic on the
host in hardware; the heuristics in the guest may change.

Regards,

K. Y
>
> Thanks,
> John
>
> >> Were tests done on IPv6 traffic ?
> >>
> >
> > Simon, could you please test this patch for IPv6 and show us the numbers?
> >
> >> Toeplitz hash takes at least 100 ns to hash 12 bytes (one iteration per
> >> bit : 96 iterations)
> >>
> >> For IPv6 it is 3 times this, since we have to hash 36 bytes.
> >>
> >> I do not see how it can compete with skb_get_hash() that directly gives
> >> skb->hash for local TCP flows.
> >>
> >
> > My guess is that this is not the bottleneck, something is happening
> > behind the scene with our packets in Hyper-V host (e.g. re-distributing
> > them to hardware queues?) but I don't know the internals, Microsoft
> > folks could probably comment.
> >
> >
> >> See commits b73c3d0e4f0e1961e15bec18720e48aabebe2109
> >> ("net: Save TX flow hash in sock and set in skbuf on xmit")
> >> and 877d1f6291f8e391237e324be58479a3e3a7407c
> >> ("net: Set sk_txhash from a random number")
> >>
> >> I understand Microsoft loves Toeplitz, but this looks not well placed
> >> here.
> >>
> >> I suspect there is another problem.
> >>
> >> Please share your numbers and test methodology, and the alternative
> >> patch Simon tested so that we can double check it.
> >>
> >
> > Alternative patch which uses skb_get_hash() attached. Simon, could you
> > please share the rest (environment, methodology, numbers) with us here?
> > Thanks!
> >
> >> Thanks.
> >>
> >> PS: For the time being this patch can probably be applied on -net tree,
> >> as it fixes a real bug.
> >
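
For anyone following the "one iteration per bit: 96 iterations" figure quoted
above, here is a minimal sketch of a bitwise Toeplitz hash in C. It is not the
netvsc driver's code. The function name toeplitz_hash(), the iteration counter
and the 16-byte example key are assumptions made for illustration only; a real
RSS key is normally longer and comes from the host or NIC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key bytes, for illustration only. 16 bytes is just enough
 * to hash a 12-byte IPv4 tuple (src addr, dst addr, src port, dst port). */
static const uint8_t example_key[16] = {
    0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
    0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
};

/* Number of inner-loop iterations in the most recent call (illustrative). */
static unsigned int toeplitz_iterations;

/* Bitwise Toeplitz hash: one iteration per input bit, MSB first. */
static uint32_t toeplitz_hash(const uint8_t *key, size_t key_len,
                              const uint8_t *data, size_t len)
{
    uint32_t hash = 0;
    uint32_t window;
    size_t i;
    int bit;

    /* The key must cover the input plus the initial 32-bit window. */
    assert(key_len >= len + 4);

    /* The window starts as the first 32 bits of the key. */
    window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
             ((uint32_t)key[2] << 8) | (uint32_t)key[3];
    toeplitz_iterations = 0;

    for (i = 0; i < len; i++) {
        for (bit = 7; bit >= 0; bit--) {
            /* XOR in the current key window for every set input bit. */
            if (data[i] & (1u << bit))
                hash ^= window;
            /* Slide the window left by one bit, pulling in the next key bit. */
            window = (window << 1) | ((key[i + 4] >> bit) & 1u);
            toeplitz_iterations++;
        }
    }
    return hash;
}
```

With a 12-byte IPv4 tuple the inner loop runs 12 * 8 = 96 times, and with a
36-byte IPv6 tuple (which would need a key of at least 40 bytes here) it runs
288 times, which is where the "3 times" estimate comes from. Table-driven
variants can trade memory for fewer iterations; this sketch only shows the
plain per-bit form.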