Re: [PATCH 9/9] [RFC] nvme: Fix a race condition

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 09/27/2016 09:56 AM, James Bottomley wrote:
On Tue, 2016-09-27 at 09:43 -0700, Bart Van Assche wrote:
On 09/27/2016 09:31 AM, Steve Wise wrote:
@@ -2079,11 +2075,15 @@ EXPORT_SYMBOL_GPL(nvme_kill_queues);
 void nvme_stop_queues(struct nvme_ctrl *ctrl)
 {
 	struct nvme_ns *ns;
+	struct request_queue *q;

 	mutex_lock(&ctrl->namespaces_mutex);
 	list_for_each_entry(ns, &ctrl->namespaces, list) {
-		blk_mq_cancel_requeue_work(ns->queue);
-		blk_mq_stop_hw_queues(ns->queue);
+		q = ns->queue;
+		blk_quiesce_queue(q);
+		blk_mq_cancel_requeue_work(q);
+		blk_mq_stop_hw_queues(q);
+		blk_resume_queue(q);
 	}
 	mutex_unlock(&ctrl->namespaces_mutex);

Hey Bart, should nvme_stop_queues() really be resuming the blk
queue?

Hello Steve,

Would you perhaps prefer that blk_resume_queue(q) is called from
nvme_start_queues()? I think that would make the NVMe code harder to
review. The above code won't cause any unexpected side effects if an
NVMe namespace is removed after nvme_stop_queues() has been called
and before nvme_start_queues() is called. Moving the
blk_resume_queue(q) call into nvme_start_queues() will only work as
expected if no namespaces are added nor removed between the
nvme_stop_queues() and nvme_start_queues() calls. I'm not familiar
enough with the NVMe code to know whether or not this change is safe
...

It's something that looks obviously wrong, so explain why you need to
do it, preferably in a comment above the function.

Hello James and Steve,

I will add a comment.

Please note that the above patch does not change the behavior of nvme_stop_queues() except that it causes nvme_stop_queues() to wait until any ongoing nvme_queue_rq() calls have finished. blk_resume_queue() does not affect the value of the BLK_MQ_S_STOPPED bit that has been set by blk_mq_stop_hw_queues(). All it does is to resume pending blk_queue_enter() calls and to ensure that future blk_queue_enter() calls do not block. Even after blk_resume_queue() has been called if a new request is queued queue_rq() won't be invoked because the BLK_MQ_S_STOPPED bit is still set. Patch "dm: Fix a race condition related to stopping and starting queues" realizes a similar change in the dm driver and that change has been tested extensively.

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]
  Powered by Linux