On Mon, Jun 13, 2022 at 04:44:03PM +0800, 黄杰 wrote: > Michael S. Tsirkin <mst@xxxxxxxxxx> 于2022年6月12日周日 22:13写道: > > > > On Sun, Jun 12, 2022 at 07:02:25PM +0800, 黄杰 wrote: > > > Michael S. Tsirkin <mst@xxxxxxxxxx> 于2022年6月11日周六 08:35写道: > > > > > > > > On Sat, Jun 11, 2022 at 12:38:10AM +0800, 黄杰 wrote: > > > > > > This pattern was always iffy, but I don't think the patch > > > > > > improves the situation much. last_used_idx and vq->packed.used_wrap_counter > > > > > > can still get out of sync. > > > > > > > > > > Yes, You are absolutely correct, thanks for pointing out this issue, I > > > > > didn't take that into consideration, > > > > > how about disabling interrupts before this code below: > > > > > > > > > > > vq->last_used_idx += vq->packed.desc_state[id].num; > > > > > > if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) { > > > > > > vq->last_used_idx -= vq->packed.vring.num; > > > > > > vq->packed.used_wrap_counter ^= 1; > > > > > > } > > > > > > > > > > it seems to be fine to just turn off the interrupts of the current vring. > > > > > > > > > > BR > > > > > > > > That would make datapath significantly slower. > > > > > > > > > > > > > > Michael S. Tsirkin <mst@xxxxxxxxxx> 于2022年6月10日周五 22:50写道: > > > > > > > > > > > > On Fri, Jun 10, 2022 at 06:33:14PM +0800, huangjie.albert wrote: > > > > > > > ksoftirqd may consume the packet and it will call: > > > > > > > virtnet_poll > > > > > > > -->virtnet_receive > > > > > > > -->virtqueue_get_buf_ctx > > > > > > > -->virtqueue_get_buf_ctx_packed > > > > > > > and in virtqueue_get_buf_ctx_packed: > > > > > > > > > > > > > > vq->last_used_idx += vq->packed.desc_state[id].num; > > > > > > > if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) { > > > > > > > vq->last_used_idx -= vq->packed.vring.num; > > > > > > > vq->packed.used_wrap_counter ^= 1; > > > > > > > } > > > > > > > > > > > > > > if at the same time, there comes a vring interrupt,in vring_interrupt: > > > > > > > we will call: > > > > > > > vring_interrupt > > > > > > > -->more_used > > > > > > > -->more_used_packed > > > > > > > -->is_used_desc_packed > > > > > > > in is_used_desc_packed, the last_used_idx maybe >= vq->packed.vring.num. > > > > > > > so this could case a memory out of bounds bug. > > > > > > > > > > > > > > this patch is to fix this. > > > > > > > > > > > > > > Signed-off-by: huangjie.albert <huangjie.albert@xxxxxxxxxxxxx> > > > > > > > > > > > > > > > > > > This pattern was always iffy, but I don't think the patch > > > > > > improves the situation much. last_used_idx and vq->packed.used_wrap_counter > > > > > > can still get out of sync. > > > > > > > > > > > > Maybe refactor code to keep everything in vq->last_used_idx? > > > > > > > > > > > > Jason what is your take? > > > > > > > > > > > > > > > > > > > --- > > > > > > > drivers/virtio/virtio_ring.c | 3 +++ > > > > > > > 1 file changed, 3 insertions(+) > > > > > > > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > > > > > > > index 13a7348cedff..d2abbb3a8187 100644 > > > > > > > --- a/drivers/virtio/virtio_ring.c > > > > > > > +++ b/drivers/virtio/virtio_ring.c > > > > > > > @@ -1397,6 +1397,9 @@ static inline bool is_used_desc_packed(const struct vring_virtqueue *vq, > > > > > > > bool avail, used; > > > > > > > u16 flags; > > > > > > > > > > > > > > + if (idx >= vq->packed.vring.num) > > > > > > > + return false; > > > > > > > + > > > > > > > flags = le16_to_cpu(vq->packed.vring.desc[idx].flags); > > > > > > > avail = !!(flags & (1 << VRING_PACKED_DESC_F_AVAIL)); > > > > > > > used = !!(flags & (1 << VRING_PACKED_DESC_F_USED)); > > > > > > > -- > > > > > > > 2.27.0 > > > > > > > > > > > > > > > > Michael S , thanks for your correction, there may be another simple > > > solution here: > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > > > index 13a7348cedff..4db4db19f94a 100644 > > > --- a/drivers/virtio/virtio_ring.c > > > +++ b/drivers/virtio/virtio_ring.c > > > @@ -1397,6 +1397,9 @@ static inline bool is_used_desc_packed(const > > > struct vring_virtqueue *vq, > > > bool avail, used; > > > u16 flags; > > > > > > + if (idx >= vq->packed.vring.num) > > > + return false; > > > + > > > flags = le16_to_cpu(vq->packed.vring.desc[idx].flags); > > > avail = !!(flags & (1 << VRING_PACKED_DESC_F_AVAIL)); > > > used = !!(flags & (1 << VRING_PACKED_DESC_F_USED)); > > > @@ -1453,8 +1456,9 @@ static void *virtqueue_get_buf_ctx_packed(struct > > > virtqueue *_vq, > > > > > > vq->last_used_idx += vq->packed.desc_state[id].num; > > > if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) { > > > - vq->last_used_idx -= vq->packed.vring.num; > > > vq->packed.used_wrap_counter ^= 1; > > > + barrier(); > > > + vq->last_used_idx -= vq->packed.vring.num; > > > } > > > > > > /* > > > > > > vq->packed.used_wrap_counter and vq->last_used_idx only increased > > > by the virtqueue_get_buf_ctx_packed, and > > > so we can add a memory barrier and Changing the order in which > > > last_used_idx and used_wrap_counter are assigned > > > should temporarily solve the problem. But as you said, the code may > > > need to be refactored to fully address these kinds of issues. > > > > > > BR > > > > this might solve the OOB access but not the problem that interrupt > > might use an incorrect value to check for the used index. > > > > Yes, thanks for that, but it seems that it can not solve the problem > that interrupt > might use an incorrect value to check for the used index if we do not > disable irq in softirqd. > the following code in virtqueue_get_buf_ctx_packed: > > >1453 /* detach_buf_packed clears data, so grab it now. */ > >1454 ret = vq->packed.desc_state[id].data; > >1455 detach_buf_packed(vq, id, ctx); > >1456 > >1457 vq->last_used_idx += vq->packed.desc_state[id].num; > >1458 if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) { > >1459 vq->last_used_idx -= vq->packed.vring.num; > >1460 vq->packed.used_wrap_counter ^= 1; > >1461 } > > after call line 1455, the real last_used_idx should add > vq->packed.desc_state[id].num, but it > add it in line 1457. if the interrupt comes before 1457, we also get > the incorrect last_used_idx. > do you have any good comments? This problem exists even if > last_used_idx and used_wrap_counter are merged and their operations > are atomic. > > BR The assumption is that we do not need to worry about the case where ring already had used buffers: these are going to be polled by the driver. We do need to worry about any buffers added after driver drains the vq. In other words the job of the callback is to signal to driver the no used buffers -> some used buffers condition. This ignores virtqueue_enable_cb_delayed. _______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization