kernel vhost demands an interrupt from guest when the ring is full in order to enable guest to submit new packets to the queue

Folks,

 

We came across a memory race condition between the VPP vhost driver and the kernel vhost. VPP is running a tap interface over a vhost backend. In this setup, VPP acts in the vhost driver role and the kernel vhost acts in the vhost device role.

 

In the kernel vhost’s TX traffic direction, which is VPP’s RX traffic direction, the kernel vhost is the producer and VPP is the consumer. The kernel vhost places traffic in its TX vring, and VPP removes traffic from its RX vring. Under heavy traffic the vring will inevitably become full at times, and the kernel vhost then has to stop and wait for the guest (VPP) to empty the vring and refill it with descriptors. When that happens, the kernel vhost clears the bit in the vring’s used flags to request an interrupt/notification. Because of a shared-memory race condition, VPP may miss the clearing of that flag and so does not send the kernel vhost an interrupt after it empties and refills the vring with new descriptors. Unfortunately, this missed notification leaves the kernel vhost stuck: once the kernel vhost is waiting for an interrupt/notification from the guest, only an interrupt/notification from the guest can resume the submission of new packets to the vring.

This design seems vulnerable. Should the kernel vhost depend entirely on the notification from the guest? Quoting the text from

 

http://docs.oasis-open.org/virtio/virtio/v1.0/virtio-v1.0.html

 

/* The device uses this in used->flags to advise the driver: don’t kick me 
 * when you add a buffer.  It’s unreliable, so it’s simply an 
 * optimization. */ 
#define VIRTQ_USED_F_NO_NOTIFY  1 

 

I read this as saying that the notification is simply an optimization, not a reliable notification mechanism. So the kernel vhost should not bet the farm on it.
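
To make the window concrete, here is a minimal sketch of the ordering involved. It is not VPP or kernel code: `demo_ring`, `demo_driver_publish` and `demo_device_may_sleep` are hypothetical names, C11 atomics stand in for whatever barrier primitives each side really uses, and it assumes the missed flag comes from a store/load reordering between the two sides. If either side skips the full fence, the driver can read a stale flag while the device reads a stale index, and the kick is lost.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define VIRTQ_USED_F_NO_NOTIFY 1

/* Hypothetical, simplified view of the two shared fields involved. */
struct demo_ring {
    _Atomic uint16_t avail_idx;   /* written by driver, read by device */
    _Atomic uint16_t used_flags;  /* written by device, read by driver */
};

/* Driver side: publish refilled descriptors, then decide whether to kick.
 * Returns true when the device must be notified. */
static bool demo_driver_publish(struct demo_ring *r, uint16_t new_avail_idx)
{
    atomic_store_explicit(&r->avail_idx, new_avail_idx, memory_order_release);
    /* Full fence: the index store is ordered before the flags load. */
    atomic_thread_fence(memory_order_seq_cst);
    uint16_t flags = atomic_load_explicit(&r->used_flags, memory_order_relaxed);
    return !(flags & VIRTQ_USED_F_NO_NOTIFY);
}

/* Device side: request a kick, then re-check the ring before waiting.
 * Returns true only when no new buffers have appeared. */
static bool demo_device_may_sleep(struct demo_ring *r, uint16_t last_seen_avail)
{
    atomic_fetch_and_explicit(&r->used_flags,
                              (uint16_t)~VIRTQ_USED_F_NO_NOTIFY,
                              memory_order_relaxed);
    /* Full fence: the flag clear is ordered before the index load. */
    atomic_thread_fence(memory_order_seq_cst);
    uint16_t avail = atomic_load_explicit(&r->avail_idx, memory_order_acquire);
    return avail == last_seen_avail;
}
```

With both fences in place, at least one side observes the other's write: either the driver sees the cleared flag and kicks, or the device sees the new index and does not wait.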

 

We encounter the same race condition in the kernel vhost’s RX traffic direction, where a missed interrupt/notification likewise leaves the kernel vhost stuck, although it happens less frequently.

 

Steven

_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
