Re: [PATCH V3 1/2] virtio-net: fix the set affinity bug when CPU IDs are not consecutive

On Thu, 2013-01-10 at 11:19 +1030, Rusty Russell wrote:
> Wanlong Gao <gaowanlong@xxxxxxxxxxxxxx> writes:
> > On 01/09/2013 07:31 AM, Rusty Russell wrote:
> >> Wanlong Gao <gaowanlong@xxxxxxxxxxxxxx> writes:
> >>>   */
> >>>  static u16 virtnet_select_queue(struct net_device *dev, struct sk_buff *skb)
> >>>  {
> >>> -	int txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
> >>> -		  smp_processor_id();
> >>> +	int txq = 0;
> >>> +
> >>> +	if (skb_rx_queue_recorded(skb))
> >>> +		txq = skb_get_rx_queue(skb);
> >>> +	else if ((txq = per_cpu(vq_index, smp_processor_id())) == -1)
> >>> +		txq = 0;
> >> 
> >> You should use __get_cpu_var() instead of smp_processor_id() here, ie:
> >> 
> >>         else if ((txq = __get_cpu_var(vq_index)) == -1)
> >> 
> >> And AFAICT, no reason to initialize txq to 0 to start with.
> >> 
> >> So:
> >> 
> >>         int txq;
> >> 
> >>         if (skb_rx_queue_recorded(skb))
> >> 		txq = skb_get_rx_queue(skb);
> >>         else {
> >>                 txq = __get_cpu_var(vq_index);
> >>                 if (txq == -1)
> >>                         txq = 0;
> >>         }
> >
> > Got it, thank you.
> >
> >> 
> >> Now, just to confirm, I assume this can happen even if we use vq_index,
> >> right, because of races with virtnet_set_channels?
> >
> > I still can't understand this race, could you explain more? thank you.
> 
> I assume that someone can call virtnet_set_channels() while we are
> inside virtnet_select_queue(), so they reduce dev->real_num_tx_queues,
> causing virtnet_select_queue() to do:
> 
> 	while (unlikely(txq >= dev->real_num_tx_queues))
> 		txq -= dev->real_num_tx_queues;
> 
> Otherwise, when is this loop called?

In fact, this race can result in the TX scheduler using a queue that has
been disabled, or other weirdness (consider what happens if
real_num_tx_queues increases between those two uses).
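
To make the "increases between those two uses" case concrete, here is a small user-space sketch. It is not driver code: struct fake_dev, read_num_queues(), wrap_two_reads() and wrap_snapshot() are hypothetical names, each read_num_queues() call stands in for one separate load of dev->real_num_tx_queues, and all types are simplified to plain int, so the exact values in the kernel may differ.

```c
/* Hypothetical model: seq[] lists the values successive loads of
 * real_num_tx_queues would observe while another CPU changes it. */
struct fake_dev {
	const int *seq;
	int pos;
};

/* One call models one separate load of the queue count. */
static int read_num_queues(struct fake_dev *d)
{
	return d->seq[d->pos++];
}

/* Same shape as the loop quoted above: the count is loaded once for
 * the comparison and again for the subtraction. */
static int wrap_two_reads(struct fake_dev *d, int txq)
{
	while (txq >= read_num_queues(d))
		txq -= read_num_queues(d);
	return txq;
}

/* Variant that loads the count once and reuses the snapshot. */
static int wrap_snapshot(struct fake_dev *d, int txq)
{
	int n = read_num_queues(d);

	while (txq >= n)
		txq -= n;
	return txq;
}
```

With txq = 5 and observed counts 4, 8, 8, wrap_two_reads() compares against 4, subtracts 8 and returns -3, which becomes 65533 once truncated to the u16 return type. wrap_snapshot() with a count of 4 returns 1. The snapshot only keeps the arithmetic self-consistent; the result can still name a queue that has since been disabled, so TX needs to be stopped around the change.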

virtnet_set_channels() really must disable TX temporarily:

	netif_tx_lock(dev);
	netif_device_detach(dev);
	netif_tx_unlock(dev);
	...
	netif_device_attach(dev);

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization