On Tue, Mar 28, 2023 at 11:39:59AM +0800, Jason Wang wrote:
> On Tue, Mar 28, 2023 at 11:09 AM 黄杰 <huangjie.albert@xxxxxxxxxxxxx> wrote:
> >
> > Jason Wang <jasowang@xxxxxxxxxx> wrote on Tue, Mar 28, 2023 at 10:59:
> > >
> > > On Tue, Mar 28, 2023 at 10:13 AM Dominique Martinet
> > > <asmadeus@xxxxxxxxxxxxx> wrote:
> > > >
> > > > Hi Michael, Albert,
> > > >
> > > > Albert Huang wrote on Sat, Mar 25, 2023 at 06:56:33PM +0800:
> > > > > in virtio_net, if we disable the napi_tx, when we trigger a tx interrupt,
> > > > > the vq->event_triggered will be set to true. It will no longer be set to
> > > > > false unless we explicitly call virtqueue_enable_cb_delayed or
> > > > > virtqueue_enable_cb_prepare.
> > > >
> > > > This patch (committed as 35395770f803 ("virtio_ring: don't update event
> > > > idx on get_buf") in next-20230327) apparently breaks 9p, as reported by
> > > > Luis in https://lkml.kernel.org/r/ZCI+7Wg5OclSlE8c@xxxxxxxxxxxxxxxxxxxxxx
> > > >
> > > > I've just had a look at recent patches[1] and reverted this to test
> > > > and I can mount again, so I'm pretty sure this is the culprit, but I
> > > > didn't look at the content at all yet so cannot advise further.
> > > > It might very well be that we need some extra handling for 9p
> > > > specifically that can be added separately if required.
> > > >
> > > > [1] git log 0ec57cfa721fbd36b4c4c0d9ccc5d78a78f7fa35..HEAD drivers/virtio/
> > > >
> > > >
> > > > This can be reproduced with a simple mount, run qemu with some -virtfs
> > > > argument and `mount -t 9p -o debug=65535 tag mountpoint` will hang after
> > > > these messages:
> > > > 9pnet: -- p9_virtio_request (83): 9p debug: virtio request
> > > > 9pnet: -- p9_virtio_request (83): virtio request kicked
> > > >
> > > > So I suspect we're just not getting a callback.
> > >
> > > I think so. The patch assumes the driver will call
> > > virtqueue_disable/enable_cb() which is not the case of the 9p driver.
> > >
> > > So after the first interrupt, event_triggered will be set to true forever.
> > >
> > > Thanks
> > >
> >
> > Hi: Wang
> >
> > Yes, this patch assumes that all virtio-related drivers will call
> > virtqueue_disable/enable_cb().
> > Thank you for raising this issue.
> >
> > It seems that napi_tx is only related to virtio_net. I'm thinking if
> > we need to refactor
> > napi_tx instead of implementing it inside virtio_ring.
>
> We can hear from others.
>
> I think it's better not to workaround virtio_ring issues in a specific
> driver. It might just add more hacks. We should correctly set
> VRING_AVAIL_F_NO_INTERRUPT,
>
> Do you think the following might work (not even a compile test)?

ok but:

> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 41144b5246a8..12f4efb6dc54 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -852,16 +852,16 @@ static void virtqueue_disable_cb_split(struct virtqueue *_vq)
>  {
>         struct vring_virtqueue *vq = to_vvq(_vq);
>
> -       if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
> -               vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> -               if (vq->event)
> -                       /* TODO: this is a hack. Figure out a cleaner value to write. */
> -                       vring_used_event(&vq->split.vring) = 0x0;
> -               else
> -                       vq->split.vring.avail->flags =
> -                               cpu_to_virtio16(_vq->vdev,
> -                                               vq->split.avail_flags_shadow);
> -       }
> +       if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> +               vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> +
> +       if (vq->event && !vq->event_triggered)
> +               /* TODO: this is a hack. Figure out a cleaner value to write.
> +                */
> +               vring_used_event(&vq->split.vring) = 0x0;
> +       else
> +               vq->split.vring.avail->flags =
> +                       cpu_to_virtio16(_vq->vdev,
> +                                       vq->split.avail_flags_shadow);
>  }
>
>  static unsigned int virtqueue_enable_cb_prepare_split(struct virtqueue *_vq)
> @@ -1697,8 +1697,10 @@ static void virtqueue_disable_cb_packed(struct virtqueue *_vq)
>  {
>         struct vring_virtqueue *vq = to_vvq(_vq);
>
> -       if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
> +       if (!(vq->packed.event_flags_shadow & VRING_PACKED_EVENT_FLAG_DISABLE))
>                 vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
> +
> +       if (vq->event_triggered)

I don't get this one. if event_triggered why do you still want to
write into driver flags? it won't trigger again anytime soon.

>                 vq->packed.vring.driver->flags =
>                         cpu_to_le16(vq->packed.event_flags_shadow);
>         }
> @@ -2330,12 +2332,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
>  {
>         struct vring_virtqueue *vq = to_vvq(_vq);
>
> -       /* If device triggered an event already it won't trigger one again:
> -        * no need to disable.
> -        */
> -       if (vq->event_triggered)
> -               return;
> -
>         if (vq->packed_ring)
>                 virtqueue_disable_cb_packed(_vq);
>         else
>
> Thanks

I think I prefer Huang Albert's other patch - are you ok with it?

> >
> > Thanks
> >
> > > >
> > > >
> > > > I'll have a closer look after work, but any advice meanwhile will be
> > > > appreciated!
> > > > (I'm sure Luis would also like a temporary drop from -next until
> > > > this is figured out, but I'll leave this up to you)
> > > >
> > > >
> > > > >
> > > > > If we disable the napi_tx, it will only be called when the tx ring
> > > > > buffer is relatively small.
> > > > >
> > > > > Because event_triggered is true. Therefore, VRING_AVAIL_F_NO_INTERRUPT or
> > > > > VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
> > > > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > > > > every time we call virtqueue_get_buf_ctx.
> > > > > This will bring more interruptions.
> > > > >
> > > > > To summarize:
> > > > > 1) event_triggered was set to true in vring_interrupt()
> > > > > 2) after this nothing will happen for virtqueue_disable_cb() so
> > > > >    VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
> > > > > 3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
> > > > >    then it tries to publish new event
> > > > >
> > > > > To fix, if event_triggered is set to true, do not update
> > > > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > > > >
> > > > > Tested with iperf:
> > > > > iperf3 tcp stream:
> > > > > vm1 -----------------> vm2
> > > > > vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
> > > > > there are many tx interrupts in vm2.
> > > > > but without event_triggered there are just a few tx interrupts.
> > > > >
> > > > > Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
> > > > > Signed-off-by: Albert Huang <huangjie.albert@xxxxxxxxxxxxx>
> > > > > Message-Id: <20230321085953.24949-1-huangjie.albert@xxxxxxxxxxxxx>
> > > > > Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
> > > > > ---
> > > > >  drivers/virtio/virtio_ring.c | 6 ++++--
> > > > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > index cbeeea1b0439..1c36fa477966 100644
> > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > @@ -914,7 +914,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> > > > >       /* If we expect an interrupt for the next entry, tell host
> > > > >        * by writing event index and flush out the write before
> > > > >        * the read in the next get_buf call.
> > > > >        */
> > > > > -     if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> > > > > +     if (unlikely(!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) &&
> > > > > +                  !vq->event_triggered))
> > > > >               virtio_store_mb(vq->weak_barriers,
> > > > >                               &vring_used_event(&vq->split.vring),
> > > > >                               cpu_to_virtio16(_vq->vdev, vq->last_used_idx));
> > > > > @@ -1744,7 +1745,8 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> > > > >        * by writing event index and flush out the write before
> > > > >        * the read in the next get_buf call.
> > > > >        */
> > > > > -     if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC)
> > > > > +     if (unlikely(vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC &&
> > > > > +                  !vq->event_triggered))
> > > > >               virtio_store_mb(vq->weak_barriers,
> > > > >                               &vq->packed.vring.driver->off_wrap,
> > > > >                               cpu_to_le16(vq->last_used_idx));
> > > > >
> > > >
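
For anyone following along, here is a rough userspace sketch of why a driver that never calls virtqueue_disable_cb()/virtqueue_enable_cb*() would stop getting callbacks with the get_buf change applied. It is a toy model of the split ring with event idx only, not the kernel code: struct toy_vq and the toy_*/device_complete/count_interrupts helpers are made-up names for illustration, and the only piece taken from the real headers is the arithmetic of vring_need_event().

```c
#include <stdbool.h>
#include <stdint.h>

/* Same arithmetic as vring_need_event() in include/uapi/linux/virtio_ring.h. */
static bool need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
	return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}

/* Hypothetical toy state; the field names only loosely mirror vring_virtqueue. */
struct toy_vq {
	uint16_t used_event;     /* index at which the driver asks for an interrupt */
	uint16_t used_idx;       /* device side: number of buffers marked used */
	uint16_t last_used_idx;  /* driver side: number of buffers reaped */
	bool event_triggered;    /* set on interrupt, never cleared in this model */
};

/* Device completes one buffer; returns true if it would raise an interrupt. */
static bool device_complete(struct toy_vq *vq)
{
	uint16_t old = vq->used_idx;

	vq->used_idx++;
	return need_event(vq->used_event, vq->used_idx, old);
}

/* Models the relevant part of vring_interrupt(). */
static void toy_interrupt(struct toy_vq *vq)
{
	vq->event_triggered = true;
}

/*
 * Models the tail of virtqueue_get_buf_ctx_split(). With patched == true the
 * event index is only published while event_triggered is false, as in the
 * quoted patch.
 */
static void toy_get_buf(struct toy_vq *vq, bool patched)
{
	vq->last_used_idx++;
	if (!patched || !vq->event_triggered)
		vq->used_event = vq->last_used_idx;
}

/*
 * A 9p-like driver: it reaps a buffer only from its callback and never
 * calls disable_cb/enable_cb, so event_triggered is never reset.
 */
static int count_interrupts(bool patched, int requests)
{
	struct toy_vq vq = {0};
	int irqs = 0;

	for (int i = 0; i < requests; i++) {
		if (device_complete(&vq)) {
			irqs++;
			toy_interrupt(&vq);
			toy_get_buf(&vq, patched);
		}
		/* No interrupt: the completion is never noticed by the driver. */
	}
	return irqs;
}
```

In this model count_interrupts(false, 3) returns 3, while count_interrupts(true, 3) returns 1: after the first interrupt used_event stays at 0, so need_event() is false for every later completion. That is consistent with "we're just not getting a callback", though the real driver paths may differ in detail.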