Re: [PATCH V4 0/10] block/scsi: safe SCSI quiescing


 



On 09/11/2017 07:10 AM, Ming Lei wrote:
Hi,

The current SCSI quiesce isn't safe, and it is easy to trigger an
I/O deadlock.

Once a SCSI device is put into QUIESCE, no new request except for
RQF_PREEMPT can be dispatched to SCSI successfully, and
scsi_device_quiesce() simply waits for completion of the I/Os already
dispatched to the SCSI stack. That isn't enough.

New requests can still arrive, but none of the allocated requests
can be dispatched successfully, so the request pool is easily
exhausted.

Then a request with RQF_PREEMPT can't be allocated and waits forever,
while scsi_device_resume() waits for completion of the RQF_PREEMPT
request, so the system hangs forever, for example during system
suspend or while sending SCSI domain validation.

Both I/O hangs, during system suspend[1] and during SCSI domain
validation, were reported before.
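
The pool exhaustion described above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not
kernel code: struct toy_queue, TOY_POOL_SIZE and the toy_*() helpers
are hypothetical names, and a failed allocation stands in for what
would be an indefinite sleep in the real block layer.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_POOL_SIZE 4   /* hypothetical request pool depth */

struct toy_queue {
	int free_reqs;    /* requests still available in the pool */
	bool quiesced;    /* device has been put into QUIESCE */
};

/* Take one request from the pool. Returns false when the pool is
 * empty; the real allocator would sleep here instead of failing. */
static bool toy_alloc(struct toy_queue *q)
{
	if (q->free_reqs == 0)
		return false;
	q->free_reqs--;
	return true;
}

/* Try to dispatch an already allocated request. On a quiesced device
 * only preempt requests get through; anything else stays allocated
 * and keeps holding its pool slot. */
static bool toy_dispatch(struct toy_queue *q, bool preempt)
{
	if (q->quiesced && !preempt)
		return false;
	assert(q->free_reqs < TOY_POOL_SIZE);
	q->free_reqs++;   /* dispatched and completed: slot is returned */
	return true;
}

/* Let normal I/O fill the queue, then report whether a preempt
 * request could still be allocated afterwards. */
static bool toy_preempt_alloc_after_flood(bool quiesced)
{
	struct toy_queue q = { .free_reqs = TOY_POOL_SIZE, .quiesced = quiesced };

	for (int i = 0; i < TOY_POOL_SIZE; i++) {
		if (toy_alloc(&q))
			toy_dispatch(&q, false);
	}
	return toy_alloc(&q);
}
```

In this model toy_preempt_alloc_after_flood(true) fails because every
slot is held by a request that can't be dispatched, while the same
call with a non-quiesced queue succeeds.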

This patchset introduces preempt freeze, and solves the issue
by preempt freezing the block queue during SCSI quiesce, while still
allowing requests with RQF_PREEMPT to be allocated when the queue is
in this state.

Oleksandr verified that V3 does fix the hang during suspend/resume,
and Cathy verified that the revised V3 fixes the hang when sending
SCSI domain validation.

Both SCSI and SCSI_MQ have this I/O deadlock issue; this patchset
fixes both by introducing/unifying blk_freeze_queue_preempt() and
blk_unfreeze_queue_preempt(), with cleanup done along the way.
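
The idea of a preempt freeze can be sketched as a gate in front of
request allocation, again as a user-space toy rather than the actual
implementation. Everything here is an assumption made for
illustration: struct gated_queue, the toy_*() helpers and the nesting
counter are hypothetical, and returning false stands in for sleeping
until the queue is unfrozen.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_GATED_POOL_SIZE 4   /* hypothetical request pool depth */

struct gated_queue {
	int free_reqs;              /* requests still available in the pool */
	int preempt_freeze_depth;   /* > 0 while the queue is preempt frozen */
};

/* Nested freeze: every freeze must be paired with one unfreeze. */
static void toy_freeze_preempt(struct gated_queue *q)
{
	q->preempt_freeze_depth++;
}

static void toy_unfreeze_preempt(struct gated_queue *q)
{
	assert(q->preempt_freeze_depth > 0);
	q->preempt_freeze_depth--;
}

/* Allocate a request. While the queue is preempt frozen, normal
 * allocations are refused before they touch the pool, so the pool
 * stays available for preempt requests. */
static bool toy_get_request(struct gated_queue *q, bool preempt)
{
	if (q->preempt_freeze_depth > 0 && !preempt)
		return false;
	if (q->free_reqs == 0)
		return false;
	q->free_reqs--;
	return true;
}

/* Normal I/O keeps arriving while frozen; a preempt request must
 * still be able to get a slot afterwards. */
static bool toy_preempt_alloc_under_freeze(void)
{
	struct gated_queue q = { .free_reqs = TOY_GATED_POOL_SIZE };

	toy_freeze_preempt(&q);
	for (int i = 0; i < 2 * TOY_GATED_POOL_SIZE; i++)
		toy_get_request(&q, false);
	return toy_get_request(&q, true);
}

/* After two freezes and one unfreeze the gate must stay closed; it
 * opens again only after the second unfreeze. */
static bool toy_nested_freeze_behaves(void)
{
	struct gated_queue q = { .free_reqs = TOY_GATED_POOL_SIZE };
	bool blocked_while_nested, allowed_after;

	toy_freeze_preempt(&q);
	toy_freeze_preempt(&q);
	toy_unfreeze_preempt(&q);
	blocked_while_nested = !toy_get_request(&q, false);
	toy_unfreeze_preempt(&q);
	allowed_after = toy_get_request(&q, false);
	return blocked_while_nested && allowed_after;
}
```

Because normal allocations are turned away at the gate, they never
consume pool slots during the freeze, which is what keeps a slot free
for the preempt request in this model.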

The patchset can be found in the following gitweb:

	https://github.com/ming1/linux/tree/blk_safe_scsi_quiesce_V4

V4:
	- reorganize patch order to make it more reasonable
	- support nested preempt freeze, as required by SCSI transport spi
	- check preempt freezing in slow path of blk_queue_enter()
	- add "SCSI: transport_spi: resume a quiesced device"
	- wake up freeze queue in setting dying for both blk-mq and legacy
	- rename blk_mq_[freeze|unfreeze]_queue() in one patch
	- rename .mq_freeze_wq and .mq_freeze_depth
	- improve comment

V3:
	- introduce q->preempt_unfreezing to fix one bug of preempt freeze
	- call blk_queue_enter_live() only when queue is preempt frozen
	- cleanup a bit on the implementation of preempt freeze
	- only patch 6 and 7 are changed

V2:
	- drop the 1st patch in V1 because percpu_ref_is_dying() is
	enough, as pointed out by Tejun
	- introduce preempt version of blk_[freeze|unfreeze]_queue
	- sync between preempt freeze and normal freeze
	- fix warning from percpu-refcount as reported by Oleksandr


[1]https://marc.info/?t=150340250100013&r=3&w=2


Thanks,
Ming


Ming Lei (10):
   blk-mq: only run hw queues for blk-mq
   block: tracking request allocation with q_usage_counter
   blk-mq: rename blk_mq_[freeze|unfreeze]_queue
   blk-mq: rename blk_mq_freeze_queue_wait as blk_freeze_queue_wait
   block: rename .mq_freeze_wq and .mq_freeze_depth
   block: pass flags to blk_queue_enter()
   block: introduce preempt version of blk_[freeze|unfreeze]_queue
   block: allow to allocate req with RQF_PREEMPT when queue is preempt
     frozen
   SCSI: transport_spi: resume a quiesced device
   SCSI: preempt freeze block queue when SCSI device is put into quiesce

  block/bfq-iosched.c               |   2 +-
  block/blk-cgroup.c                |   8 +-
  block/blk-core.c                  |  95 ++++++++++++++++----
  block/blk-mq.c                    | 180 ++++++++++++++++++++++++++++----------
  block/blk-mq.h                    |   1 -
  block/blk-timeout.c               |   2 +-
  block/blk.h                       |  12 +++
  block/elevator.c                  |   4 +-
  drivers/block/loop.c              |  24 ++---
  drivers/block/rbd.c               |   2 +-
  drivers/nvme/host/core.c          |   8 +-
  drivers/scsi/scsi_lib.c           |  25 +++++-
  drivers/scsi/scsi_transport_spi.c |   3 +
  fs/block_dev.c                    |   4 +-
  include/linux/blk-mq.h            |  15 ++--
  include/linux/blkdev.h            |  32 +++++--
  16 files changed, 313 insertions(+), 104 deletions(-)


I've tested this patch set for spi_transport issuing a domain validation under low blk_request conditions.

Tested-by: Cathy Avery <cavery@xxxxxxxxxx>


