Re: [PATCH] virtio_ring: Fix the stale index in available ring

On 3/20/24 04:22, Will Deacon wrote:
On Tue, Mar 19, 2024 at 02:59:23PM +1000, Gavin Shan wrote:
On 3/19/24 02:59, Will Deacon wrote:
   drivers/virtio/virtio_ring.c | 12 +++++++++---
   1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 49299b1f9ec7..7d852811c912 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -687,9 +687,15 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
   	avail = vq->split.avail_idx_shadow & (vq->split.vring.num - 1);
   	vq->split.vring.avail->ring[avail] = cpu_to_virtio16(_vq->vdev, head);
-	/* Descriptors and available array need to be set before we expose the
-	 * new available array entries. */
-	virtio_wmb(vq->weak_barriers);
+	/*
+	 * Descriptors and available array need to be set before we expose
+	 * the new available array entries. virtio_wmb() should be enough
+	 * to ensure the order theoretically. However, a stronger barrier
+	 * is needed by ARM64. Otherwise, the stale data can be observed
+	 * by the host (vhost). A stronger barrier should work for other
+	 * architectures, but performance loss is expected.
+	 */
+	virtio_mb(false);
   	vq->split.avail_idx_shadow++;
   	vq->split.vring.avail->idx = cpu_to_virtio16(_vq->vdev,
   						vq->split.avail_idx_shadow);

Replacing a DMB with a DSB is _very_ unlikely to be the correct solution
here, especially when ordering accesses to coherent memory.

In practice, either the larger timing difference from the DSB or the fact
that you're going from a Store->Store barrier to a full barrier is what
makes things "work" for you. Have you tried, for example, a DMB SY
(e.g. via __smp_mb())?

We definitely shouldn't take changes like this without a proper
explanation of what is going on.


Thanks for your comments, Will.

Yes, DMB should work for us. However, it seems this instruction has issues on
NVIDIA's Grace Hopper. It's hard for me to understand how DMB and DSB work
at the hardware level. I agree that replacing DMB with DSB is not the solution
until we fully understand the root cause.

I tried the possible replacements listed below. __smp_mb() avoids the issue, as
__mb() does. __ndelay(10) also avoids the issue, but __ndelay(9) doesn't.

static inline int virtqueue_add_split(struct virtqueue *_vq, ...)
{
     :
         /* Put entry in available array (but don't update avail->idx until they
          * do sync). */
         avail = vq->split.avail_idx_shadow & (vq->split.vring.num - 1);
         vq->split.vring.avail->ring[avail] = cpu_to_virtio16(_vq->vdev, head);

         /* Descriptors and available array need to be set before we expose the
          * new available array entries. */
         // Broken: virtio_wmb(vq->weak_barriers);
         // Broken: __dma_mb();
         // Works:  __mb();
         // Works:  __smp_mb();

It's pretty weird that __dma_mb() is "broken" but __smp_mb() "works". How
confident are you in that result?


Yes, __dma_mb() is even stronger than __smp_mb(). I reran the tests, and both
__dma_mb() and __smp_mb() work for us. I ran too many tests yesterday and
something may have been mixed up.

Barrier             Times the issue was hit in 10 tests
-------------------------------------------------------
__smp_wmb()         8
__smp_mb()          0
__dma_wmb()         7
__dma_mb()          0
__mb()              0
__wmb()             0
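
For reference, here is a small sketch that pairs each barrier in the table
with the arm64 instruction it is assumed to expand to. The mapping is recalled
from arch/arm64/include/asm/barrier.h and should be checked against the tree
under test; the struct and helper names are hypothetical and exist only for
this illustration. The hit counts are copied from the table above.

```c
#include <stddef.h>
#include <string.h>

/* Assumed arm64 expansion of each barrier macro, plus the hit count
 * reported in the table (issue reproduced N times out of 10 runs). */
struct barrier_map {
	const char *macro;
	const char *insn;
	int hits;
};

static const struct barrier_map arm64_barriers[] = {
	{ "__smp_wmb()", "dmb ishst", 8 },
	{ "__smp_mb()",  "dmb ish",   0 },
	{ "__dma_wmb()", "dmb oshst", 7 },
	{ "__dma_mb()",  "dmb osh",   0 },
	{ "__mb()",      "dsb sy",    0 },
	{ "__wmb()",     "dsb st",    0 },
};

#define NR_BARRIERS (sizeof(arm64_barriers) / sizeof(arm64_barriers[0]))

/* Return the assumed instruction for a macro name, or NULL if unknown. */
static const char *arm64_insn_for(const char *macro)
{
	for (size_t i = 0; i < NR_BARRIERS; i++)
		if (strcmp(arm64_barriers[i].macro, macro) == 0)
			return arm64_barriers[i].insn;
	return NULL;
}

/* A store-only DMB is a "dmb" whose option ends in "st". */
static int is_store_only_dmb(const struct barrier_map *b)
{
	size_t len = strlen(b->insn);

	return strncmp(b->insn, "dmb", 3) == 0 &&
	       len >= 2 && strcmp(b->insn + len - 2, "st") == 0;
}

/* Return 1 if the entries that hit the issue are exactly the
 * store-only DMB variants, 0 otherwise. */
static int failures_match_store_only_dmb(void)
{
	for (size_t i = 0; i < NR_BARRIERS; i++)
		if ((arm64_barriers[i].hits > 0) !=
		    is_store_only_dmb(&arm64_barriers[i]))
			return 0;
	return 1;
}
```

Under that assumed mapping, the two failing entries are the store-only DMB
variants, while the full DMBs and both DSB variants (including the store-only
DSB behind __wmb()) did not hit the issue in these runs.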

It's strange that __smp_mb() works, but __smp_wmb() fails. It seems we need a
read barrier here. I will try WRITE_ONCE() + __smp_wmb() as suggested by Michael
in another reply, and will post the results soon.
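
To make the ordering requirement concrete, here is a userspace sketch of the
publish pattern using C11 atomics. It is only an analogy, not the kernel code:
the toy_* names and RING_SIZE are hypothetical, the relaxed atomic store stands
in for WRITE_ONCE(), and the release fence is assumed to be at least as strong
as __smp_wmb() (it also orders earlier loads, so it is not an exact match).

```c
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u	/* hypothetical size; must be a power of two */

struct toy_avail {
	_Atomic uint16_t idx;		/* published index, like avail->idx */
	uint16_t ring[RING_SIZE];	/* like avail->ring[] */
};

/* Producer: fill the ring slot, then expose it by bumping idx. */
static void toy_publish(struct toy_avail *a, uint16_t *shadow, uint16_t head)
{
	a->ring[*shadow & (RING_SIZE - 1)] = head;

	/* The slot write must be visible before the new idx. */
	atomic_thread_fence(memory_order_release);

	(*shadow)++;
	atomic_store_explicit(&a->idx, *shadow, memory_order_relaxed);
}

/* Consumer: returns 1 and stores the entry in *head if one is available. */
static int toy_consume(struct toy_avail *a, uint16_t *last_seen,
		       uint16_t *head)
{
	uint16_t idx = atomic_load_explicit(&a->idx, memory_order_relaxed);

	if (idx == *last_seen)
		return 0;

	/* Pairs with the producer's release fence: read idx, then the slot. */
	atomic_thread_fence(memory_order_acquire);

	*head = a->ring[*last_seen & (RING_SIZE - 1)];
	(*last_seen)++;
	return 1;
}
```

If the consumer could observe the new idx before the slot write, it would read
a stale ring entry, which is the symptom described in this thread. The sketch
does not reproduce the hardware behaviour; it only shows which two accesses
the barrier is supposed to order.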

Thanks,
Gavin




