Hi Jens,

> -----Original Message-----
> From: Jens Axboe [mailto:axboe@xxxxxxxxx]
> Subject: Re: connect cmd error for nvme-rdma with eventual kernel crash
>
> > On Feb 28, 2017, at 5:57 PM, Parav Pandit <parav@xxxxxxxxxxxx> wrote:
> >
> > Hi Jens,
> >
> > With your commit 2af8cbe30531eca73c8f3ba277f155fc0020b01a in the
> > linux-block git tree, there are two request tables, static and dynamic,
> > of the same size.
> > However, blk_mq_tag_to_rq() always looks up the request for a tag in the
> > dynamic table, which doesn't seem to be always initialized.
> >
> > I am running the nvme-rdma initiator, and it fails to find the request
> > for the given tag when a command completes.
> > The command triggers error recovery with a "tag not found" error.
> > Eventually the kernel crashes in blk_mq_queue_tag_busy_iter() with a NULL
> > pointer dereference. That seems to be an additional bug in error recovery.
> >
> > To debug, I added initialization of the dynamic tags as well.
> >
> > blk_mq_alloc_rqs() {
> >     tags->static_rqs[i] = rq;
> > +   tags->rqs[i] = rq;
> >
> > This appears to resolve the issue, but that's not the fix.
> > It appears to me that the nvme stack is broken in certain conditions with
> > the recent static and dynamic rq tables change.
>
> Can you try my for-linus branch?

I tried the for-linus branch and it works.
It seems that commit ac6e0c2d633ab0411810fe6b15a40808309041db fixes it:

__blk_mq_alloc_request()
    data->hctx->tags->rqs[rq->tag] = rq;

The commit message says there is no functional difference, but it actually fixes this issue.

Parav
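
For readers following the thread, the failure mode described above can be illustrated with a small userspace model. This is only a sketch, not kernel code: every `model_*` name, the `MODEL_DEPTH` constant, and the `publish` flag are hypothetical and exist purely to show why a lookup that reads only the dynamic table returns NULL when the allocation path never stored the request there.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_DEPTH 4 /* hypothetical queue depth, chosen for illustration */

struct model_rq {
    int tag; /* -1 while the request is not in use */
};

/* Two tables of the same size, as described in the thread. */
struct model_tags {
    struct model_rq *static_rqs[MODEL_DEPTH]; /* filled once at setup */
    struct model_rq *rqs[MODEL_DEPTH];        /* filled when a tag is handed out */
};

static struct model_rq model_pool[MODEL_DEPTH];

/* Plays the role of the setup step: only the static table is populated. */
static void model_alloc_rqs(struct model_tags *tags)
{
    for (int i = 0; i < MODEL_DEPTH; i++) {
        model_pool[i].tag = -1;
        tags->static_rqs[i] = &model_pool[i];
        tags->rqs[i] = NULL;
    }
}

/*
 * Plays the role of the allocation path. When publish is non-zero, the
 * request is also stored in the dynamic table, which corresponds to the
 * assignment quoted in the thread: tags->rqs[rq->tag] = rq.
 */
static struct model_rq *model_alloc_request(struct model_tags *tags,
                                            unsigned int tag, int publish)
{
    assert(tag < MODEL_DEPTH);

    struct model_rq *rq = tags->static_rqs[tag];
    rq->tag = (int)tag;
    if (publish)
        tags->rqs[tag] = rq;
    return rq;
}

/* Completion-side lookup: reads only the dynamic table. */
static struct model_rq *model_tag_to_rq(const struct model_tags *tags,
                                        unsigned int tag)
{
    return tag < MODEL_DEPTH ? tags->rqs[tag] : NULL;
}
```

Calling `model_alloc_request()` with `publish` set to 0 and then `model_tag_to_rq()` for the same tag yields NULL, which models the "tag not found" symptom; with `publish` set to 1 the lookup returns the allocated request.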