Re: [PATCH] virtio_blk: always post notifications under the lock

"Boyer, Andrew" <Andrew.Boyer@xxxxxxx> · Wed, 22 Jan 2025 14:48:10 +0000

On Jan 9, 2025, at 8:42 AM, Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
> 
> On Thu, Jan 09, 2025 at 01:01:20PM +0100, Christian Borntraeger wrote:
>> 
>> 
>> Am 07.01.25 um 19:25 schrieb Andrew Boyer:
>>> Commit af8ececda185 ("virtio: add VIRTIO_F_NOTIFICATION_DATA feature
>>> support") added notification data support to the core virtio driver
>>> code. When this feature is enabled, the notification includes the
>>> updated producer index for the queue. Thus it is now critical that
>>> notifications arrive in order.
>>> 
>>> The virtio_blk driver has historically not worried about notification
>>> ordering. Modify it so that the prepare and kick steps are both done
>>> under the vq lock.
>>> 
>>> Signed-off-by: Andrew Boyer <andrew.boyer@xxxxxxx>
>>> Reviewed-by: Brett Creeley <brett.creeley@xxxxxxx>
>>> Fixes: af8ececda185 ("virtio: add VIRTIO_F_NOTIFICATION_DATA feature support")
>>> Cc: Viktor Prutyanov <viktor@xxxxxxxxxx>
>>> Cc: virtualization@xxxxxxxxxxxxxxx
>>> Cc: linux-block@xxxxxxxxxxxxxxx
>>> ---
>>>  drivers/block/virtio_blk.c | 19 ++++---------------
>>>  1 file changed, 4 insertions(+), 15 deletions(-)
>>> 
>>> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
>>> index 3efe378f1386..14d9e66bb844 100644
>>> --- a/drivers/block/virtio_blk.c
>>> +++ b/drivers/block/virtio_blk.c
>>> @@ -379,14 +379,10 @@ static void virtio_commit_rqs(struct blk_mq_hw_ctx *hctx)
>>>  {
>>>    struct virtio_blk *vblk = hctx->queue->queuedata;
>>>    struct virtio_blk_vq *vq = &vblk->vqs[hctx->queue_num];
>>> -   bool kick;
>>>    spin_lock_irq(&vq->lock);
>>> -   kick = virtqueue_kick_prepare(vq->vq);
>>> +   virtqueue_kick(vq->vq);
>>>    spin_unlock_irq(&vq->lock);
>>> -
>>> -   if (kick)
>>> -           virtqueue_notify(vq->vq);
>>>  }
>> 
>> I would assume this will be a performance nightmare for normal IO.
> 

Hello Michael and Christian and Jason,
Thank you for taking a look.

Is the performance concern that the vmexit might lead to the underlying virtual storage stack doing the work immediately? Any other job posting to the same queue would presumably be blocked on a vmexit when it goes to attempt its own notification. That would be almost the same as having the other job block on a lock during the operation, although I guess if you are skipping notifications somehow it would look different.

I don't have any sort of setup where I can try it but I would appreciate it if someone else could.

> Hmm. Not good, notify can be very slow, holding a lock is a bad idea.
> Basically, virtqueue_notify must work ouside of locks, this
> means af8ececda185 is broken and we did not notice.
> 
> Let's fix it please.

With so many broken kernels already in the wild, I think disabling F_NOTIFICATION_DATA for virtio-blk would be a reasonable solution.

> Try some kind of compare and swap scheme where we detect that index
> was updated since? Will allow skipping a notification, too.

Do you have an idea of how this might be done? Anything I've come up with involves a lock. Would it be doable to have a lock for the vq management stuff and a second one to post notifications?

> AMD guys, can't device survive with reordered notifications?
> Basically just drop a notification if you see index
> going back?

This is the driver lying to us about the state of the queue; it's not going to be possible for us to work around it in hardware. For starters, how would we detect queue wrap around?

Thank you,
Andrew

(Apologies for sending twice, mail client issue)