(repost) Re: [PATCH] virtio_blk: always post notifications under the lock

"Michael S. Tsirkin" <mst@xxxxxxxxxx> · Wed, 22 Jan 2025 10:15:00 -0500

On Wed, Jan 22, 2025 at 02:44:50PM +0000, Boyer, Andrew wrote:
> 
> 
>     On Jan 9, 2025, at 8:42 AM, Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
> 
>     On Thu, Jan 09, 2025 at 01:01:20PM +0100, Christian Borntraeger wrote:
> 
> 
>         Am 07.01.25 um 19:25 schrieb Andrew Boyer:
> 
>             Commit af8ececda185 ("virtio: add VIRTIO_F_NOTIFICATION_DATA
>             feature
>             support") added notification data support to the core virtio driver
>             code. When this feature is enabled, the notification includes the
>             updated producer index for the queue. Thus it is now critical that
>             notifications arrive in order.
> 
>             The virtio_blk driver has historically not worried about
>             notification
>             ordering. Modify it so that the prepare and kick steps are both
>             done
>             under the vq lock.
> 
>             Signed-off-by: Andrew Boyer <andrew.boyer@xxxxxxx>
>             Reviewed-by: Brett Creeley <brett.creeley@xxxxxxx>
>             Fixes: af8ececda185 ("virtio: add VIRTIO_F_NOTIFICATION_DATA
>             feature support")
>             Cc: Viktor Prutyanov <viktor@xxxxxxxxxx>
>             Cc: virtualization@xxxxxxxxxxxxxxx
>             Cc: linux-block@xxxxxxxxxxxxxxx
>             ---
>              drivers/block/virtio_blk.c | 19 ++++---------------
>              1 file changed, 4 insertions(+), 15 deletions(-)
> 
>             diff --git a/drivers/block/virtio_blk.c b/drivers/block/
>             virtio_blk.c
>             index 3efe378f1386..14d9e66bb844 100644
>             --- a/drivers/block/virtio_blk.c
>             +++ b/drivers/block/virtio_blk.c
>             @@ -379,14 +379,10 @@ static void virtio_commit_rqs(struct
>             blk_mq_hw_ctx *hctx)
>              {
>                struct virtio_blk *vblk = hctx->queue->queuedata;
>                struct virtio_blk_vq *vq = &vblk->vqs[hctx->queue_num];
>             -   bool kick;
>                spin_lock_irq(&vq->lock);
>             -   kick = virtqueue_kick_prepare(vq->vq);
>             +   virtqueue_kick(vq->vq);
>                spin_unlock_irq(&vq->lock);
>             -
>             -   if (kick)
>             -           virtqueue_notify(vq->vq);
>              }
> 
> 
>         I would assume this will be a performance nightmare for normal IO.
> 
> 
> 
> 
> Hello Michael and Christian and Jason,
> Thank you for taking a look.
> 
> Is the performance concern that the vmexit might lead to the underlying virtual
> storage stack doing the work immediately? Any other job posting to the same
> queue would presumably be blocked on a vmexit when it goes to attempt its own
> notification. That would be almost the same as having the other job block on a
> lock during the operation, although I guess if you are skipping notifications
> somehow it would look different.
> 
> I don't have any sort of setup where I can try it but I would appreciate it if
> someone else could.
> 
> 
>     Hmm. Not good, notify can be very slow, holding a lock is a bad idea.
>     Basically, virtqueue_notify must work ouside of locks, this
>     means af8ececda185 is broken and we did not notice.
> 
>     Let's fix it please.
> 
> 
> With so many broken kernels already in the wild, I think disabling
> F_NOTIFICATION_DATA for virtio-blk would be a reasonable solution.

Some devices might fail feature negotiation then.
I am not sure they are broken, devices might simply be able to
handle out of order values.

> 
>     Try some kind of compare and swap scheme where we detect that index
>     was updated since? Will allow skipping a notification, too.
> 
> 
> Do you have an idea of how this might be done? Anything I've come up with
> involves a lock.
>
> Would it be doable to have a lock for the vq management stuff
> and a second one to post notifications?

and only for when F_NOTIFICATION_DATA is set. not terrible ok I think.

> 
>     AMD guys, can't device survive with reordered notifications?
>     Basically just drop a notification if you see index
>     going back?
> 
> 
> This is the driver lying to us about the state of the queue; it's not going to
> be possible for us to work around it in hardware. For starters, how would we
> detect queue wrap around?
> 
> Thank you,
> Andrew

The index is a running value for split, for wrap arounds, there is
a special bit for that. No?

> 
> 
>     --
>     MST
> 
>