RE: [PATCH 9/9] [RFC] nvme: Fix a race condition

> On 09/27/2016 09:31 AM, Steve Wise wrote:
> >> @@ -2079,11 +2075,15 @@ EXPORT_SYMBOL_GPL(nvme_kill_queues);
> >>  void nvme_stop_queues(struct nvme_ctrl *ctrl)
> >>  {
> >>  	struct nvme_ns *ns;
> >> +	struct request_queue *q;
> >>
> >>  	mutex_lock(&ctrl->namespaces_mutex);
> >>  	list_for_each_entry(ns, &ctrl->namespaces, list) {
> >> -		blk_mq_cancel_requeue_work(ns->queue);
> >> -		blk_mq_stop_hw_queues(ns->queue);
> >> +		q = ns->queue;
> >> +		blk_quiesce_queue(q);
> >> +		blk_mq_cancel_requeue_work(q);
> >> +		blk_mq_stop_hw_queues(q);
> >> +		blk_resume_queue(q);
> >>  	}
> >>  	mutex_unlock(&ctrl->namespaces_mutex);
> >
> > Hey Bart, should nvme_stop_queues() really be resuming the blk queue?
> 
> Hello Steve,
> 
> Would you perhaps prefer that blk_resume_queue(q) is called from
> nvme_start_queues()? I think that would make the NVMe code harder to
> review. 

I'm still learning the blk code (and the nvme code :)), but I would think
blk_resume_queue() would cause requests to start being submitted on the NVMe
queues, which I believe shouldn't happen while they are stopped.  I'm currently
debugging a problem where requests are submitted to the nvme-rdma driver while
it has supposedly stopped all the nvme and blk mqs.  I tried your series at
Christoph's request to see if it resolved my problem, but it didn't.

> The above code won't cause any unexpected side effects if an
> NVMe namespace is removed after nvme_stop_queues() has been called and
> before nvme_start_queues() is called. Moving the blk_resume_queue(q)
> call into nvme_start_queues() will only work as expected if no
> namespaces are added nor removed between the nvme_stop_queues() and
> nvme_start_queues() calls. I'm not familiar enough with the NVMe code to
> know whether or not this change is safe ...
> 

I'll have to look and see whether namespaces can be added/deleted while an nvme
controller is in the RECONNECTING state.  In the meantime, I'm going to move
the blk_resume_queue() call to nvme_start_queues() and see if it helps my problem.
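
To make the namespace add/remove concern concrete, here is a toy userspace
model of the two placements.  It is not kernel code: the toy_* names, the
quiesce_depth counter and MAX_NS are all made up for illustration, and it
assumes only that every quiesce needs exactly one matching resume on the same
queue.  It says nothing about what blk_resume_queue() actually dispatches.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NS 8	/* arbitrary limit for the toy model */

/* Hypothetical stand-in for a request queue: tracks quiesce nesting only. */
struct toy_queue {
	int quiesce_depth;	/* +1 per quiesce, -1 per resume */
};

/* Hypothetical stand-in for a controller and its namespace list. */
struct toy_ctrl {
	struct toy_queue ns[MAX_NS];
	size_t count;
};

static void toy_add_ns(struct toy_ctrl *c)
{
	assert(c->count < MAX_NS);
	c->ns[c->count++].quiesce_depth = 0;
}

/* Placement as in the patch: quiesce and resume are paired inside stop. */
static void toy_stop_paired(struct toy_ctrl *c)
{
	for (size_t i = 0; i < c->count; i++) {
		c->ns[i].quiesce_depth++;	/* models the quiesce call */
		/* cancelling requeue work and stopping hw queues go here */
		c->ns[i].quiesce_depth--;	/* models the resume call */
	}
}

/* Alternative placement: quiesce in stop, resume deferred to start. */
static void toy_stop_split(struct toy_ctrl *c)
{
	for (size_t i = 0; i < c->count; i++)
		c->ns[i].quiesce_depth++;
}

static void toy_start_split(struct toy_ctrl *c)
{
	for (size_t i = 0; i < c->count; i++)
		c->ns[i].quiesce_depth--;
}

/*
 * Run one stop/start cycle on a controller with two namespaces, optionally
 * adding a third between stop and start.  Returns how many namespaces end
 * up with an unbalanced quiesce depth.
 */
static int toy_unbalanced_after_cycle(int split, int add_between)
{
	struct toy_ctrl c = { .count = 0 };
	int bad = 0;

	toy_add_ns(&c);
	toy_add_ns(&c);

	if (split)
		toy_stop_split(&c);
	else
		toy_stop_paired(&c);

	if (add_between)
		toy_add_ns(&c);

	if (split)
		toy_start_split(&c);

	for (size_t i = 0; i < c.count; i++)
		if (c.ns[i].quiesce_depth != 0)
			bad++;
	return bad;
}
```

In this model the paired placement stays balanced whether or not a namespace
shows up in between, while the split placement resumes a queue that was never
quiesced.  A removal between the two calls would be the mirror image: a queue
quiesced in stop that start never visits.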

Christoph:  Thoughts?

Steve.
