On Wed, Jan 22, 2025 at 05:45:28PM +0000, Boyer, Andrew wrote: > > > > On Jan 22, 2025, at 10:13 AM, Michael S. Tsirkin <mst@xxxxxxxxxx> wrote: > > > > Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. > > > > > > On Wed, Jan 22, 2025 at 02:44:50PM +0000, Boyer, Andrew wrote: > >> > >> > >> On Jan 9, 2025, at 8:42 AM, Michael S. Tsirkin <mst@xxxxxxxxxx> wrote: > >> > >> On Thu, Jan 09, 2025 at 01:01:20PM +0100, Christian Borntraeger wrote: > >> > >> > >> Am 07.01.25 um 19:25 schrieb Andrew Boyer: > >> > >> Commit af8ececda185 ("virtio: add VIRTIO_F_NOTIFICATION_DATA > >> feature > >> support") added notification data support to the core virtio driver > >> code. When this feature is enabled, the notification includes the > >> updated producer index for the queue. Thus it is now critical that > >> notifications arrive in order. > >> > >> The virtio_blk driver has historically not worried about > >> notification > >> ordering. Modify it so that the prepare and kick steps are both > >> done > >> under the vq lock. > >> > >> Signed-off-by: Andrew Boyer <andrew.boyer@xxxxxxx> > >> Reviewed-by: Brett Creeley <brett.creeley@xxxxxxx> > >> Fixes: af8ececda185 ("virtio: add VIRTIO_F_NOTIFICATION_DATA > >> feature support") > >> Cc: Viktor Prutyanov <viktor@xxxxxxxxxx> > >> Cc: virtualization@xxxxxxxxxxxxxxx > >> Cc: linux-block@xxxxxxxxxxxxxxx > >> --- > >> drivers/block/virtio_blk.c | 19 ++++--------------- > >> 1 file changed, 4 insertions(+), 15 deletions(-) > >> > >> diff --git a/drivers/block/virtio_blk.c b/drivers/block/ > >> virtio_blk.c > >> index 3efe378f1386..14d9e66bb844 100644 > >> --- a/drivers/block/virtio_blk.c > >> +++ b/drivers/block/virtio_blk.c > >> @@ -379,14 +379,10 @@ static void virtio_commit_rqs(struct > >> blk_mq_hw_ctx *hctx) > >> { > >> struct virtio_blk *vblk = hctx->queue->queuedata; > >> struct virtio_blk_vq *vq = &vblk->vqs[hctx->queue_num]; > >> - bool kick; > >> spin_lock_irq(&vq->lock); > >> - kick = virtqueue_kick_prepare(vq->vq); > >> + virtqueue_kick(vq->vq); > >> spin_unlock_irq(&vq->lock); > >> - > >> - if (kick) > >> - virtqueue_notify(vq->vq); > >> } > >> > >> > >> I would assume this will be a performance nightmare for normal IO. > >> > >> > >> > >> > >> Hello Michael and Christian and Jason, > >> Thank you for taking a look. > >> > >> Is the performance concern that the vmexit might lead to the underlying virtual > >> storage stack doing the work immediately? Any other job posting to the same > >> queue would presumably be blocked on a vmexit when it goes to attempt its own > >> notification. That would be almost the same as having the other job block on a > >> lock during the operation, although I guess if you are skipping notifications > >> somehow it would look different. > >> > >> I don't have any sort of setup where I can try it but I would appreciate it if > >> someone else could. > >> > >> > >> Hmm. Not good, notify can be very slow, holding a lock is a bad idea. > >> Basically, virtqueue_notify must work ouside of locks, this > >> means af8ececda185 is broken and we did not notice. > >> > >> Let's fix it please. > >> > >> > >> With so many broken kernels already in the wild, I think disabling > >> F_NOTIFICATION_DATA for virtio-blk would be a reasonable solution. > > > > Some devices might fail feature negotiation then. > > I am not sure they are broken, devices might simply be able to > > handle out of order values. > > > > A driver which does not support F_NOTIFICATION_DATA should just clear that bit. Surely devices which support it would also support not enabling it? Otherwise pre-6.4 kernels wouldn't work at all. > > > > >> > >> Try some kind of compare and swap scheme where we detect that index > >> was updated since? Will allow skipping a notification, too. > >> > >> > >> Do you have an idea of how this might be done? Anything I've come up with > >> involves a lock. > >> > >> Would it be doable to have a lock for the vq management stuff > >> and a second one to post notifications? > > > > > > and only for when F_NOTIFICATION_DATA is set. not terrible ok I think. > > > >> > >> AMD guys, can't device survive with reordered notifications? > >> Basically just drop a notification if you see index > >> going back? > >> > >> > >> This is the driver lying to us about the state of the queue; it's not going to > >> be possible for us to work around it in hardware. For starters, how would we > >> detect queue wrap around? > >> > >> Thank you, > >> Andrew > > > > The index is a running value for split, for wrap arounds, there is > > a special bit for that. No? > > > > This is a hardware block used for many different interfaces and devices. When the notification write comes through, the doorbell block updates the queue state and schedules the queue for work. If a second notification comes in and overwrites that update before the queue is able to run (going backwards but not wrapping), we'll have no way of detecting it. > > -Andrew Do you not have programmable hardware that can compare current queue state and the doorbell *before* overwriting it?