Hi James, On Mon, Nov 01, 2021 at 09:43:27PM -0400, James Bottomley wrote: > On Thu, 2021-10-21 at 22:59 +0800, Ming Lei wrote: > > For fixing queue quiesce race between driver and block layer(elevator > > switch, update nr_requests, ...), we need to support concurrent > > quiesce > > and unquiesce, which requires the two call balanced. > > > > It isn't easy to audit that in all scsi drivers, especially the two > > may > > be called from different contexts, so do it in scsi core with one > > per-device > > bit flag & global spinlock, basically zero cost since request queue > > quiesce > > is seldom triggered. > > > > Reported-by: Yi Zhang <yi.zhang@xxxxxxxxxx> > > Fixes: e70feb8b3e68 ("blk-mq: support concurrent queue > > quiesce/unquiesce") > > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx> > > --- > > drivers/scsi/scsi_lib.c | 45 ++++++++++++++++++++++++++++++---- > > ---- > > include/scsi/scsi_device.h | 1 + > > 2 files changed, 37 insertions(+), 9 deletions(-) > > > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c > > index 51fcd46be265..414f4daf8005 100644 > > --- a/drivers/scsi/scsi_lib.c > > +++ b/drivers/scsi/scsi_lib.c > > @@ -2638,6 +2638,40 @@ static int > > __scsi_internal_device_block_nowait(struct scsi_device *sdev) > > return 0; > > } > > > > +static DEFINE_SPINLOCK(sdev_queue_stop_lock); > > + > > +void scsi_start_queue(struct scsi_device *sdev) > > +{ > > + bool need_start; > > + unsigned long flags; > > + > > + spin_lock_irqsave(&sdev_queue_stop_lock, flags); > > + need_start = sdev->queue_stopped; > > + sdev->queue_stopped = 0; > > + spin_unlock_irqrestore(&sdev_queue_stop_lock, flags); > > + > > + if (need_start) > > + blk_mq_unquiesce_queue(sdev->request_queue); > > Well, this is a classic atomic pattern: > > if (cmpxchg(&sdev->queue_stopped, 1, 0)) > blk_mq_unquiesce_queue(sdev->request_queue); > > The reason to do it with atomics rather than spinlocks is > > 1. no need to disable interrupts: atomics are locked > 2. faster because a spinlock takes an exclusive line every time but the > read to check the value can be in shared mode in cmpxchg > 3. it's just shorter and better code. You are right, I agree. > > The only minor downside is queue_stopped now needs to be a u32. Yeah, that is the reason I don't take this atomic way since it needs to add one extra u32 into 'struct scsi_device'. Thanks, Ming