On Wed, May 16, 2018 at 10:04:20AM +0800, Ming Lei wrote: > On Tue, May 15, 2018 at 05:56:14PM +0800, jianchao.wang wrote: > > Hi ming > > > > On 05/15/2018 08:33 AM, Ming Lei wrote: > > > We still have to quiesce admin queue before canceling request, so looks > > > the following patch is better, so please ignore the above patch and try > > > the following one and see if your hang can be addressed: > > > > > > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c > > > index f509d37b2fb8..c2adc76472a8 100644 > > > --- a/drivers/nvme/host/pci.c > > > +++ b/drivers/nvme/host/pci.c > > > @@ -1741,8 +1741,7 @@ static int nvme_alloc_admin_tags(struct nvme_dev *dev) > > > dev->ctrl.admin_q = NULL; > > > return -ENODEV; > > > } > > > - } else > > > - blk_mq_unquiesce_queue(dev->ctrl.admin_q); > > > + } > > > > > > return 0; > > > } > > > @@ -2520,6 +2519,12 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown, bool > > > */ > > > if (shutdown) > > > nvme_start_queues(&dev->ctrl); > > > + > > > + /* > > > + * Avoid to suck reset because timeout may happen during reset and > > > + * reset may hang forever if admin queue is kept as quiesced > > > + */ > > > + blk_mq_unquiesce_queue(dev->ctrl.admin_q); > > > mutex_unlock(&dev->shutdown_lock); > > > } > > > > w/ patch above and patch below, both the warning and io hung issue didn't reproduce till now. > > > > > > @@ -1450,6 +1648,7 @@ static int nvme_create_queue(struct nvme_queue *nvmeq, int qid) > > { > > struct nvme_dev *dev = nvmeq->dev; > > int result; > > + int cq_vector; > > > > if (dev->cmb && use_cmb_sqes && (dev->cmbsz & NVME_CMBSZ_SQS)) { > > unsigned offset = (qid - 1) * roundup(SQ_SIZE(nvmeq->q_depth), > > @@ -1462,15 +1661,16 @@ static int nvme_create_queue(struct nvme_queue *nvmeq, int qid) > > * A queue's vector matches the queue identifier unless the controller > > * has only one vector available. > > */ > > - nvmeq->cq_vector = dev->num_vecs == 1 ? 0 : qid; > > - result = adapter_alloc_cq(dev, qid, nvmeq); > > + cq_vector = dev->num_vecs == 1 ? 0 : qid; > > + result = adapter_alloc_cq(dev, qid, nvmeq, cq_vector); > > if (result < 0) > > - goto release_vector; > > + goto out; > > Think of this issue further, the above change will cause adapter_alloc_cq() > failed immediately because nvmeq->cq_vector isn't set before submitting this > admin IO. > > So could you check if only the patch("unquiesce admin queue after shutdown > controller") can fix your IO hang issue? > > BTW, the warning from genirq can be left alone, that is another issue. Ooops, no such issue at all since admin queue is ready, please ignore the noise, sorry, :-( Thanks, Ming