Hi Christoph

Thanks for your kind response.

On 06/20/2018 10:39 PM, Christoph Hellwig wrote:
>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>> index 73a97fc..2a161f6 100644
>> --- a/drivers/nvme/host/pci.c
>> +++ b/drivers/nvme/host/pci.c
>> @@ -1203,6 +1203,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>>  		nvme_warn_reset(dev, csts);
>>  		nvme_dev_disable(dev, false);
>>  		nvme_reset_ctrl(&dev->ctrl);
>> +		__blk_mq_complete_request(req);
>>  		return BLK_EH_DONE;
>>  	}
>>
>> @@ -1213,6 +1214,11 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>>  		dev_warn(dev->ctrl.device,
>>  			 "I/O %d QID %d timeout, completion polled\n",
>>  			 req->tag, nvmeq->qid);
>> +		/*
>> +		 * nvme_end_request will invoke blk_mq_complete_request,
>> +		 * it will do nothing for this timed out request.
>> +		 */
>> +		__blk_mq_complete_request(req);
>
> And this clearly is bogus.  We want to iterate over the tagset
> and cancel all requests, not do that manually here.
>
> That was the whole point of the original change.
>

For nvme-pci, we do indeed have an issue: when
nvme_reset_work->nvme_dev_disable returns, the timeout path may still be
running, and the nvme_dev_disable invoked by the timeout path will then
race with nvme_reset_work.

However, that hole is already there right now without my changes, just
narrower.

Thanks
Jianchao