On 3/27/19 10:15 AM, Keith Busch wrote: > On Wed, Mar 27, 2019 at 10:03:26AM +0800, jianchao.wang wrote: >> Hi Keith >> >> On 3/27/19 7:57 AM, Keith Busch wrote: >>> On Mon, Mar 25, 2019 at 08:05:53PM -0700, jianchao.wang wrote: >>>> What if there used to be a io scheduler and leave some stale requests of sched tags ? >>>> Or the nr_hw_queues was decreased and leave the hctx->fq->flush_rq ? >>> >>> Requests internally queued in scheduler or block layer are not eligible >>> for the nvme driver's iterator callback. We only use it to reclaim >>> dispatched requests that the target can't return, which only applies to >>> requests that must have a valid rq->tag value from hctx->tags. >>> >>>> The stable request could be some tings freed and used >>>> by others and the state field happen to be overwritten to non-zero... >>> >>> I am not sure I follow what this means. At least for nvme, every queue >>> sharing the same tagset is quiesced and frozen, there should be no >>> request state in flux at the time we iterate. >>> >> >> In nvme_dev_disable, when we try to reclaim the in-flight requests with blk_mq_tagset_busy_iter, >> the request_queues are quiesced but just start-freeze. >> We will try to _drain_ the in-flight requests for the _shutdown_ case when controller is not dead. >> For the reset case, there still could be someone escapes the checking of queue freezing and enters >> blk_mq_make_request and tries to allocate tag, then we may get, >> >> generic_make_request nvme_dev_disable >> -> blk_queue_enter >> -> nvme_start_freeze (just start freeze, no drain) >> -> nvme_stop_queues >> -> blk_mq_make_request >> - > blk_mq_get_request -> blk_mq_tagset_busy_iter >> -> blk_mq_get_tag >> -> bt_tags_for_each >> -> bt_tags_iter >> -> rq = tags->rqs[] ---> [1] >> -> blk_mq_rq_ctx_init >> -> data->hctx->tags->rqs[rq->tag] = rq; >> >> The rq got on position [1] could be a stale request that has been freed due to, >> 1. a hctx->fq.flush_rq of dead request_queue that shares the same tagset >> 2. a removed io scheduler's sched request >> >> And this stale request may have been used by others and the request->state is changed to a non-zero >> value and passes the checking of blk_mq_request_started and then it will be handled by nvme_cancel_request. > > How is that request state going to be anyting other than IDLE? A freed > request state is IDLE, and continues to be IDLE until dispatched. But > dispatch is blocked for the entire tagset, so request states can't be > started during an nvme reset. > As the comment above, the stable request maybe something that has been freed due to following case, 1. a hctx->fq.flush_rq of dead request_queue that shares the same tagset 2. a removed io scheduler's sched request and this freed request could be allocated by others which may change the field of request->state. Thanks Jianchao