From: Benjamin Serebrin <serebrin@xxxxxxxxxx>

If the number of virtio queue pairs is not equal to the number of VCPUs,
the virtio guest driver doesn't assign any CPU affinity for the queue
interrupts, nor any XPS transmit affinity. (In contrast, the driver does
assign both when the VCPU and queue counts are equal, which is a good
default behavior.)

Google Compute Engine currently provides one queue pair for every VCPU,
capped at a maximum of 32 queue pairs.

This patch extends the driver's default interrupt-affinity and
transmit-affinity settings to the case where the queue and VCPU counts do
not match. Userspace affinity adjustment may always be needed to tune for
a given workload.

Tested: (on 64-VCPU and 48-VCPU VMs with Debian 8, jessie-backports
kernel 4.9.2)

Without the fix, we see all queues affinitized to all CPUs:

cd /proc/irq
for i in `seq 24 92` ; do sudo grep ".*" $i/smp_affinity_list; done
0-63
[...]
0-63

and we see all TX queues' xps_cpus affinitized to no cores:

cd /sys/class/net/eth0/queues
for i in `seq 0 31` ; do sudo grep ".*" tx-$i/xps_cpus; done
00000000,00000000
[...]
00000000,00000000

With the fix, we see each queue's interrupt assigned to a single core,
and xps affinity set to one unique CPU per TX queue (plus its hyperthread
twin, where one exists).

64 VCPU:

cd /proc/irq
for i in `seq 24 92` ; do sudo grep ".*" $i/smp_affinity_list; done
0-63
0
0
1
1
2
2
3
3
4
4
5
5
6
6
7
7
8
8
9
9
10
10
11
11
12
12
13
13
14
14
15
15
16
16
17
17
18
18
19
19
20
20
21
21
22
22
23
23
24
24
25
25
26
26
27
27
28
28
29
29
30
30
31
31
0-63
0-63
0-63
0-63

cd /sys/class/net/eth0/queues
for i in `seq 0 31` ; do sudo grep ".*" tx-$i/xps_cpus; done
00000001,00000001
00000002,00000002
00000004,00000004
00000008,00000008
00000010,00000010
00000020,00000020
00000040,00000040
00000080,00000080
00000100,00000100
00000200,00000200
00000400,00000400
00000800,00000800
00001000,00001000
00002000,00002000
00004000,00004000
00008000,00008000
00010000,00010000
00020000,00020000
00040000,00040000
00080000,00080000
00100000,00100000
00200000,00200000
00400000,00400000
00800000,00800000
01000000,01000000
02000000,02000000
04000000,04000000
08000000,08000000
10000000,10000000
20000000,20000000
40000000,40000000
80000000,80000000

48 VCPU:

cd /proc/irq
for i in `seq 24 92` ; do sudo grep ".*" $i/smp_affinity_list; done
0-47
0
0
1
1
2
2
3
3
4
4
5
5
6
6
7
7
8
8
9
9
10
10
11
11
12
12
13
13
14
14
15
15
16
16
17
17
18
18
19
19
20
20
21
21
22
22
23
23
24
24
25
25
26
26
27
27
28
28
29
29
30
30
31
31
0-47
0-47
0-47
0-47

cd /sys/class/net/eth0/queues
for i in `seq 0 31` ; do sudo grep ".*" tx-$i/xps_cpus; done
0001,00000001
0002,00000002
0004,00000004
0008,00000008
0010,00000010
0020,00000020
0040,00000040
0080,00000080
0100,00000100
0200,00000200
0400,00000400
0800,00000800
1000,00001000
2000,00002000
4000,00004000
8000,00008000
0000,00010000
0000,00020000
0000,00040000
0000,00080000
0000,00100000
0000,00200000
0000,00400000
0000,00800000
0000,01000000
0000,02000000
0000,04000000
0000,08000000
0000,10000000
0000,20000000
0000,40000000
0000,80000000

Acked-by: Willem de Bruijn <willemb@xxxxxxxxxx>
Acked-by: Jim Mattson <jmattson@xxxxxxxxxx>
Acked-by: Venkatesh Srinivas <venkateshs@xxxxxxxxxx>
Signed-off-by: Ben Serebrin <serebrin@xxxxxxxxxx>
---
 drivers/net/virtio_net.c | 30 +++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)
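For illustration, the XPS striping in this patch reduces to a closed form:
TX queue i is given CPUs i, i + nqueues, i + 2*nqueues, and so on, for
every such CPU that is online. Below is a minimal userspace sketch, not
part of the patch; the stripe_mask() helper and the hard-coded 48/32
counts are illustrative only:

#include <stdio.h>

/* Model of the patch's striping loop: queue q gets CPUs q, q + nqueues,
 * q + 2*nqueues, ... as long as they are below ncpus.
 */
static unsigned long long stripe_mask(int queue, int nqueues, int ncpus)
{
        unsigned long long mask = 0;
        int cpu;

        for (cpu = queue; cpu < ncpus; cpu += nqueues)
                mask |= 1ULL << cpu;
        return mask;
}

int main(void)
{
        int ncpus = 48, nqueues = 32;   /* the 48-VCPU case from the test log */
        int i;

        for (i = 0; i < nqueues; i++) {
                unsigned long long m = stripe_mask(i, nqueues, ncpus);

                /* Print in the xps_cpus style: high 32-bit word, low word. */
                printf("tx-%d: %04llx,%08llx\n", i,
                       m >> 32, m & 0xffffffffULL);
        }
        return 0;
}

Run as-is, this prints tx-0: 0001,00000001 through tx-31: 0000,80000000,
matching the 48-VCPU xps_cpus log above; the 64-VCPU case is the same
computation with ncpus = 64, only the printed field width differs.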
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 765c2d6358da..0dc3a102bfc4 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1502,20 +1502,44 @@ static void virtnet_set_affinity(struct virtnet_info *vi)
 	/* In multiqueue mode, when the number of cpu is equal to the number of
 	 * queue pairs, we let the queue pairs to be private to one cpu by
 	 * setting the affinity hint to eliminate the contention.
 	 */
-	if (vi->curr_queue_pairs == 1 ||
-	    vi->max_queue_pairs != num_online_cpus()) {
+	if (vi->curr_queue_pairs == 1) {
 		virtnet_clean_affinity(vi, -1);
 		return;
 	}
 
+	/* If there are more cpus than queues, then assign the queues'
+	 * interrupts to the first cpus until we run out.
+	 */
 	i = 0;
 	for_each_online_cpu(cpu) {
+		if (i == vi->max_queue_pairs)
+			break;
 		virtqueue_set_affinity(vi->rq[i].vq, cpu);
 		virtqueue_set_affinity(vi->sq[i].vq, cpu);
-		netif_set_xps_queue(vi->dev, cpumask_of(cpu), i);
 		i++;
 	}
 
+	/* Stripe the XPS affinities across the online CPUs.
+	 * Hyperthread pairs are typically assigned such that Linux's
+	 * CPU X and X + (numcpus / 2) are hyperthread twins, so we cause
+	 * hyperthread twins to share TX queues, in the case where there are
+	 * more cpus than queues.
+	 */
+	for (i = 0; i < vi->max_queue_pairs; i++) {
+		struct cpumask mask;
+		int skip = i;
+
+		cpumask_clear(&mask);
+		for_each_online_cpu(cpu) {
+			while (skip--)
+				cpu = cpumask_next(cpu, cpu_online_mask);
+			if (cpu < num_possible_cpus())
+				cpumask_set_cpu(cpu, &mask);
+			skip = vi->max_queue_pairs - 1;
+		}
+		netif_set_xps_queue(vi->dev, &mask, i);
+	}
+
 	vi->affinity_hint_set = true;
 }
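The striping comment in the hunk above assumes that Linux numbers
hyperthread twins as CPU X and X + (numcpus / 2). That assumption can be
spot-checked on a given guest from the topology sysfs files; a minimal
sketch, illustration only and not part of the patch:

#include <stdio.h>

/* Print each CPU's hyperthread sibling list until sysfs runs out of CPUs. */
int main(void)
{
        char path[128], buf[64];
        FILE *f;
        int cpu;

        for (cpu = 0; ; cpu++) {
                snprintf(path, sizeof(path),
                         "/sys/devices/system/cpu/cpu%d/topology/thread_siblings_list",
                         cpu);
                f = fopen(path, "r");
                if (!f)
                        break;  /* no such CPU; stop */
                if (fgets(buf, sizeof(buf), f))
                        printf("cpu%d: %s", cpu, buf);
                fclose(f);
        }
        return 0;
}

On the 64-VCPU VM above, the twin layout the patch assumes would show up
as cpu0: 0,32, cpu1: 1,33, and so on.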