Re: [PATCH v3 2/2] nvme: use blk_mq_[un]quiesce_tagset


 





On 2022/10/20 14:11, Sagi Grimberg wrote:


On 10/20/22 06:53, Chao Leng wrote:
All controller namespaces share the same tagset, so we can use this
interface, which does the optimal operation for parallel quiesce based on
the tagset type (e.g. blocking tagsets and non-blocking tagsets).

The nvme connect_q should not be quiesced when the tagset is quiesced, so
set QUEUE_FLAG_SKIP_TAGSET_QUIESCE when initializing connect_q to skip it.

Currently we use NVME_NS_STOPPED to ensure that quiescing and
unquiescing are paired. With blk_mq_[un]quiesce_tagset, NVME_NS_STOPPED
is no longer valid, so introduce NVME_CTRL_STOPPED to replace it.
In addition, we never really quiesce a single namespace, so moving the
flag from ns to ctrl is the better choice.

Signed-off-by: Sagi Grimberg <sagi@xxxxxxxxxxx>
Signed-off-by: Chao Leng <lengchao@xxxxxxxxxx>
---
  drivers/nvme/host/core.c | 57 +++++++++++++++++++-----------------------------
  drivers/nvme/host/nvme.h |  3 ++-
  2 files changed, 25 insertions(+), 35 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 059737c1a2c1..c7727d1f228e 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4890,6 +4890,7 @@ int nvme_alloc_io_tag_set(struct nvme_ctrl *ctrl, struct blk_mq_tag_set *set,
              ret = PTR_ERR(ctrl->connect_q);
              goto out_free_tag_set;
          }
+        blk_queue_flag_set(QUEUE_FLAG_SKIP_TAGSET_QUIESCE, ctrl->connect_q);
      }
      ctrl->tagset = set;
@@ -5013,6 +5014,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
      clear_bit(NVME_CTRL_FAILFAST_EXPIRED, &ctrl->flags);
      spin_lock_init(&ctrl->lock);
      mutex_init(&ctrl->scan_lock);
+    mutex_init(&ctrl->queue_state_lock);

Why is this lock needed?
It protects code paths that require the queues to be quiesced.
The scenario (without the lock):
Thread A: calls nvme_stop_queues, sets NVME_CTRL_STOPPED, then quiesces the tagset
          and waits for the grace period.
Thread B: calls nvme_stop_queues; because NVME_CTRL_STOPPED is already set, it
          continues with work that requires the queues to be quiesced. Because
          the grace period has not ended yet, this may misbehave.
Thread A: the grace period ends, and it continues.
So a lock is added to ensure that all queues are quiesced after NVME_CTRL_STOPPED is set.
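
As a rough illustration of the locked scheme described above, here is a small
userspace model, not the kernel code from the patch. Everything prefixed
model_ is hypothetical: a pthread mutex stands in for ctrl->queue_state_lock,
a bool stands in for the NVME_CTRL_STOPPED bit, and model_quiesce_tagset()
stands in for blk_mq_quiesce_tagset() including its grace-period wait.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the controller state. */
struct model_ctrl {
	pthread_mutex_t queue_state_lock; /* models ctrl->queue_state_lock */
	bool stopped;                     /* models the NVME_CTRL_STOPPED bit */
	bool quiesced;                    /* true once the grace period has ended */
};

/* Models blk_mq_quiesce_tagset(); the real call also waits for a grace period. */
static void model_quiesce_tagset(struct model_ctrl *ctrl)
{
	ctrl->quiesced = true;
}

/*
 * Models nvme_stop_queues() with the lock held across both the flag update
 * and the quiesce. Returns the quiesced state the caller sees on return.
 */
static bool model_stop_queues(struct model_ctrl *ctrl)
{
	bool seen;

	pthread_mutex_lock(&ctrl->queue_state_lock);
	if (!ctrl->stopped) {
		ctrl->stopped = true;
		model_quiesce_tagset(ctrl);
	}
	seen = ctrl->quiesced;
	pthread_mutex_unlock(&ctrl->queue_state_lock);
	return seen;
}

struct model_caller_arg {
	struct model_ctrl *ctrl;
	bool saw_quiesced;
};

static void *model_caller(void *p)
{
	struct model_caller_arg *arg = p;

	arg->saw_quiesced = model_stop_queues(arg->ctrl);
	return NULL;
}

/* Runs two concurrent callers; returns 1 if both returned with queues quiesced. */
int model_run_two_callers(void)
{
	struct model_ctrl ctrl = { .stopped = false, .quiesced = false };
	struct model_caller_arg a = { &ctrl, false }, b = { &ctrl, false };
	pthread_t ta, tb;
	int rc;

	pthread_mutex_init(&ctrl.queue_state_lock, NULL);
	rc = pthread_create(&ta, NULL, model_caller, &a);
	assert(rc == 0);
	rc = pthread_create(&tb, NULL, model_caller, &b);
	assert(rc == 0);
	(void)rc;
	pthread_join(ta, NULL);
	pthread_join(tb, NULL);
	pthread_mutex_destroy(&ctrl.queue_state_lock);
	return a.saw_quiesced && b.saw_quiesced;
}
```

In this model the second caller cannot return before the first one has
finished quiescing, whichever thread wins the lock, so both callers see the
queues quiesced.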

The old code handled this by forcing a wait for the grace period:
	if (!test_and_set_bit(NVME_NS_STOPPED, &ns->flags))
		blk_mq_quiesce_queue(ns->queue);
	else
		blk_mq_wait_quiesce_done(ns->queue);
The old code was not completely safe either. Consider this scenario:
Thread A: runs test_and_set_bit, then is interrupted by a hardware irq and loses the CPU.
Thread B: runs test_and_set_bit, waits for the grace period, and then continues
          with work that requires the queue to be quiesced; because the queue
          is not quiesced yet, this may misbehave.
Thread A: gets to run again, calls blk_mq_quiesce_queue, and then continues.
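
The interleaving above can be replayed deterministically in a single-threaded
userspace model. This is only a sketch under simplifying assumptions: all
model_ names are hypothetical, and the wait helper is assumed to guarantee a
quiesced queue only if the quiesce had already been started before the wait.

```c
#include <stdbool.h>

/* Hypothetical userspace model of one namespace queue. */
struct model_ns {
	bool stopped_flag;    /* models the NVME_NS_STOPPED bit */
	bool quiesce_started; /* blk_mq_quiesce_queue() has begun */
	bool quiesced;        /* a grace period ended after the quiesce began */
};

/* Models test_and_set_bit(): sets the flag and returns its old value. */
static bool model_test_and_set(struct model_ns *ns)
{
	bool old = ns->stopped_flag;

	ns->stopped_flag = true;
	return old;
}

/* Models blk_mq_wait_quiesce_done(): only helps if a quiesce already began. */
static void model_wait_quiesce_done(struct model_ns *ns)
{
	if (ns->quiesce_started)
		ns->quiesced = true;
}

/* Models blk_mq_quiesce_queue(): start the quiesce, then wait for it. */
static void model_quiesce_queue(struct model_ns *ns)
{
	ns->quiesce_started = true;
	model_wait_quiesce_done(ns);
}

/* Replays the scenario; returns the queue state thread B sees when it proceeds. */
bool model_old_code_race(void)
{
	struct model_ns ns = { false, false, false };
	bool a_old, b_old, seen_by_b;

	a_old = model_test_and_set(&ns);     /* Thread A: sets the flag, then loses the CPU */
	b_old = model_test_and_set(&ns);     /* Thread B: finds the flag already set */
	if (b_old)
		model_wait_quiesce_done(&ns);    /* Thread B: takes the else branch */
	seen_by_b = ns.quiesced;             /* Thread B: proceeds with its work */
	if (!a_old)
		model_quiesce_queue(&ns);        /* Thread A: resumes and quiesces */
	return seen_by_b;
}
```

Here model_old_code_race() returns false: thread B goes on while the queue is
still not quiesced, which is the window described in the scenario.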
