On Tue, Mar 26, 2024 at 03:46:29PM +0000, Will Deacon wrote:
> On Tue, Mar 26, 2024 at 11:43:13AM +0000, Will Deacon wrote:
> > On Tue, Mar 26, 2024 at 09:38:55AM +0000, Keir Fraser wrote:
> > > On Tue, Mar 26, 2024 at 03:49:02AM -0400, Michael S. Tsirkin wrote:
> > > > > Secondly, the debugging code is enhanced so that the available head for
> > > > > (last_avail_idx - 1) is read for twice and recorded. It means the available
> > > > > head for one specific available index is read for twice. I do see the
> > > > > available heads are different from the consecutive reads. More details
> > > > > are shared as below.
> > > > >
> > > > > From the guest side
> > > > > ===================
> > > > >
> > > > > virtio_net virtio0: output.0:id 86 is not a head!
> > > > > head to be released: 047 062 112
> > > > >
> > > > > avail_idx:
> > > > > 000  49665
> > > > > 001  49666  <--
> > > > >  :
> > > > > 015  49664
> > > >
> > > > what are these #s 49665 and so on?
> > > > and how large is the ring?
> > > > I am guessing 49664 is the index ring size is 16 and
> > > > 49664 % 16 == 0
> > >
> > > More than that, 49664 % 256 == 0
> > >
> > > So again there seems to be an error in the vicinity of roll-over of
> > > the idx low byte, as I observed in the earlier log. Surely this is
> > > more than coincidence?
> >
> > Yeah, I'd still really like to see the disassembly for both sides of the
> > protocol here. Gavin, is that something you're able to provide? Worst
> > case, the host and guest vmlinux objects would be a starting point.
> >
> > Personally, I'd be fairly surprised if this was a hardware issue.
>
> Ok, long shot after eyeballing the vhost code, but does the diff below
> help at all? It looks like vhost_vq_avail_empty() can advance the value
> saved in 'vq->avail_idx' but without the read barrier, possibly confusing
> vhost_get_vq_desc() in polling mode.
>
> Will
>
> --->8
>
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 045f666b4f12..87bff710331a 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -2801,6 +2801,7 @@ bool vhost_vq_avail_empty(struct vhost_dev *dev, struct vhost_virtqueue *vq)
>  		return false;
>  	vq->avail_idx = vhost16_to_cpu(vq, avail_idx);
>  
> +	smp_rmb();
>  	return vq->avail_idx == vq->last_avail_idx;
>  }
>  EXPORT_SYMBOL_GPL(vhost_vq_avail_empty);

Oh wow, you are right.

We have:

	if (vq->avail_idx == vq->last_avail_idx) {
		if (unlikely(vhost_get_avail_idx(vq, &avail_idx))) {
			vq_err(vq, "Failed to access avail idx at %p\n",
				&vq->avail->idx);
			return -EFAULT;
		}
		vq->avail_idx = vhost16_to_cpu(vq, avail_idx);

		if (unlikely((u16)(vq->avail_idx - last_avail_idx) > vq->num)) {
			vq_err(vq, "Guest moved used index from %u to %u",
				last_avail_idx, vq->avail_idx);
			return -EFAULT;
		}

		/* If there's nothing new since last we looked, return
		 * invalid.
		 */
		if (vq->avail_idx == last_avail_idx)
			return vq->num;

		/* Only get avail ring entries after they have been
		 * exposed by guest.
		 */
		smp_rmb();
	}

and so the rmb only happens if avail_idx is not advanced.

Actually, there is a bunch of code duplication where we assign to
avail_idx, too.

Will, thanks a lot for looking into this! I kept looking into the virtio
side for some reason; the fact that it did not trigger with qemu should
have been a big hint!

-- 
MST
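
To make the ordering problem concrete, here is a rough userspace sketch of
the idea, not the vhost code itself. Everything prefixed toy_ is
hypothetical, RING_SIZE is an assumed value, and a C11 acquire fence stands
in for smp_rmb(). The point it tries to show: if every refresh of the cached
avail index goes through a single helper that also issues the read barrier
when the index has moved, then the "is it empty?" path and the "get next
head" path cannot disagree about whether the barrier has happened, which
would also remove the duplicated assignments to avail_idx.

```c
/* Userspace illustration only; all toy_* names are made up for this sketch. */
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 16u	/* assumed ring size, matching the 16-entry dump above */

struct toy_ring {
	_Atomic uint16_t idx;		/* free-running index published by the producer */
	uint16_t ring[RING_SIZE];	/* head values, indexed by idx % RING_SIZE */
};

struct toy_vq {
	struct toy_ring *avail;
	uint16_t avail_idx;		/* consumer's cached copy of avail->idx */
	uint16_t last_avail_idx;	/* next entry the consumer will take */
};

/*
 * Single place where the cached index is refreshed.
 * Returns 1 if new entries are visible, 0 if none, -1 on a bogus index.
 */
static int toy_fetch_avail_idx(struct toy_vq *vq)
{
	uint16_t idx = atomic_load_explicit(&vq->avail->idx,
					    memory_order_relaxed);

	if ((uint16_t)(idx - vq->last_avail_idx) > RING_SIZE)
		return -1;

	vq->avail_idx = idx;
	if (vq->avail_idx == vq->last_avail_idx)
		return 0;

	/* Stand-in for smp_rmb(): ring entries are read only after the index. */
	atomic_thread_fence(memory_order_acquire);
	return 1;
}

/* Polling-mode check; may advance the cached index, now with the barrier. */
static int toy_avail_empty(struct toy_vq *vq)
{
	if (vq->avail_idx != vq->last_avail_idx)
		return 0;
	return toy_fetch_avail_idx(vq) == 0;
}

/* Returns the next head, RING_SIZE if nothing is available, -1 on error. */
static int toy_get_head(struct toy_vq *vq)
{
	if (vq->avail_idx == vq->last_avail_idx) {
		int r = toy_fetch_avail_idx(vq);

		if (r < 0)
			return -1;
		if (r == 0)
			return RING_SIZE;
	}
	/* Cached index already advanced: the helper issued the barrier then. */
	return vq->avail->ring[vq->last_avail_idx++ % RING_SIZE];
}

/* Producer side: write the entry first, then publish the new index. */
static void toy_publish(struct toy_ring *r, uint16_t head)
{
	uint16_t idx = atomic_load_explicit(&r->idx, memory_order_relaxed);

	r->ring[idx % RING_SIZE] = head;
	atomic_store_explicit(&r->idx, (uint16_t)(idx + 1),
			      memory_order_release);
}
```

A single-threaded run cannot reproduce the reordering, of course; the sketch
only shows the structure. A consumer would typically poll with
toy_avail_empty() and then drain with toy_get_head() until it returns
RING_SIZE.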