On 11/2/21 8:33 AM, James Bottomley wrote: > On Tue, 2021-11-02 at 06:59 -0600, Jens Axboe wrote: >> On 11/1/21 7:43 PM, James Bottomley wrote: >>> On Thu, 2021-10-21 at 22:59 +0800, Ming Lei wrote: >>>> For fixing queue quiesce race between driver and block >>>> layer(elevator switch, update nr_requests, ...), we need to >>>> support concurrent quiesce and unquiesce, which requires the two >>>> call balanced. >>>> >>>> It isn't easy to audit that in all scsi drivers, especially the >>>> two may be called from different contexts, so do it in scsi core >>>> with one per-device bit flag & global spinlock, basically zero >>>> cost since request queue quiesce is seldom triggered. >>>> >>>> Reported-by: Yi Zhang <yi.zhang@xxxxxxxxxx> >>>> Fixes: e70feb8b3e68 ("blk-mq: support concurrent queue >>>> quiesce/unquiesce") >>>> Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx> >>>> --- >>>> drivers/scsi/scsi_lib.c | 45 ++++++++++++++++++++++++++++++ >>>> ---- >>>> ---- >>>> include/scsi/scsi_device.h | 1 + >>>> 2 files changed, 37 insertions(+), 9 deletions(-) >>>> >>>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c >>>> index 51fcd46be265..414f4daf8005 100644 >>>> --- a/drivers/scsi/scsi_lib.c >>>> +++ b/drivers/scsi/scsi_lib.c >>>> @@ -2638,6 +2638,40 @@ static int >>>> __scsi_internal_device_block_nowait(struct scsi_device *sdev) >>>> return 0; >>>> } >>>> >>>> +static DEFINE_SPINLOCK(sdev_queue_stop_lock); >>>> + >>>> +void scsi_start_queue(struct scsi_device *sdev) >>>> +{ >>>> + bool need_start; >>>> + unsigned long flags; >>>> + >>>> + spin_lock_irqsave(&sdev_queue_stop_lock, flags); >>>> + need_start = sdev->queue_stopped; >>>> + sdev->queue_stopped = 0; >>>> + spin_unlock_irqrestore(&sdev_queue_stop_lock, flags); >>>> + >>>> + if (need_start) >>>> + blk_mq_unquiesce_queue(sdev->request_queue); >>> >>> Well, this is a classic atomic pattern: >>> >>> if (cmpxchg(&sdev->queue_stopped, 1, 0)) >>> blk_mq_unquiesce_queue(sdev->request_queue); >>> >>> The reason to do it with atomics rather than spinlocks is >>> >>> 1. no need to disable interrupts: atomics are locked >>> 2. faster because a spinlock takes an exclusive line every time >>> but the >>> read to check the value can be in shared mode in cmpxchg >>> 3. it's just shorter and better code. >>> >>> The only minor downside is queue_stopped now needs to be a u32. >> >> Are you fine with the change as-is, or do you want it redone? I >> can drop the SCSI parts and just queue up the dm fix. Personally >> I think it'd be better to get it fixed upfront. > > Well, given the path isn't hot, I don't really care. However, what I > don't want is to have to continually bat back patches from the make > work code churners trying to update this code for being the wrong > pattern. I think at the very least it needs a comment saying why we > chose a suboptimal pattern to try to forestall this. Right, with a comment it's probably better. And as you said, since it's not a hot path, don't think we'd be revisiting it anyway. I'll amend the patch with a comment. -- Jens Axboe