RE: connect cmd error for nvme-rdma with eventual kernel crash

Hi Jens,

> -----Original Message-----
> From: Jens Axboe [mailto:axboe@xxxxxxxxx]
> Subject: Re: connect cmd error for nvme-rdma with eventual kernel crash
> 
> > On Feb 28, 2017, at 5:57 PM, Parav Pandit <parav@xxxxxxxxxxxx> wrote:
> >
> > Hi Jens,
> >
> > With your commit 2af8cbe30531eca73c8f3ba277f155fc0020b01a in the
> > linux-block git tree, there are two request tables, static and dynamic,
> > of the same size.
> > However, blk_mq_tag_to_rq() always tries to get the request from the
> > dynamic table, which doesn't seem to be always initialized.
> >
> > I am running the nvme-rdma initiator and it fails to find the request
> > for the given tag when a command completes.
> > The command triggers error recovery with a "tag not found" error.
> > Eventually the kernel crashes in blk_mq_queue_tag_busy_iter() with a
> > NULL pointer. This seems to be an additional bug in error recovery.
> >
> > To debug, I also initialized the dynamic table:
> >
> > blk_mq_alloc_rqs() {
> >            tags->static_rqs[i] = rq;
> > +            tags->rqs[i] = rq;
> >
> > This appears to resolve the issue, but it is not the proper fix.
> > It appears to me that the nvme stack is broken under certain conditions
> > with the recent static and dynamic rq tables change.
> 
> Can you try my for-linus branch?

I tried the for-linus branch and it works.

It seems that commit ac6e0c2d633ab0411810fe6b15a40808309041db fixes it.
With that commit, __blk_mq_alloc_request() populates the dynamic table:
data->hctx->tags->rqs[rq->tag] = rq;

The commit message says there is no functional difference, but it actually
fixes this issue.
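
To illustrate the failure mode, here is a minimal user-space sketch. It is
not kernel code: the model_* names, MODEL_DEPTH, and the 'publish' flag are
hypothetical and exist only for this example. It assumes, as described in
this thread, that setup fills only the static table and that lookup by tag
reads only the dynamic table.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_DEPTH 4 /* hypothetical queue depth for this model */

struct model_request {
	int tag;
};

struct model_tags {
	struct model_request pool[MODEL_DEPTH];        /* backing storage */
	struct model_request *static_rqs[MODEL_DEPTH]; /* filled once at setup */
	struct model_request *rqs[MODEL_DEPTH];        /* filled per allocation */
};

/* Setup: only the static table is populated; the dynamic table stays empty. */
static void model_alloc_rqs(struct model_tags *tags)
{
	for (int i = 0; i < MODEL_DEPTH; i++) {
		tags->pool[i].tag = -1;
		tags->static_rqs[i] = &tags->pool[i];
		tags->rqs[i] = NULL;
	}
}

/*
 * Allocation: take the request for 'tag' from the static table.
 * When 'publish' is nonzero, also store it in the dynamic table,
 * mirroring the rqs[rq->tag] = rq assignment quoted above.
 */
static struct model_request *model_alloc_request(struct model_tags *tags,
						 int tag, int publish)
{
	assert(tag >= 0 && tag < MODEL_DEPTH);

	struct model_request *rq = tags->static_rqs[tag];

	rq->tag = tag;
	if (publish)
		tags->rqs[rq->tag] = rq;
	return rq;
}

/* Lookup: consults only the dynamic table, so an unpublished tag yields NULL. */
static struct model_request *model_tag_to_rq(const struct model_tags *tags,
					     int tag)
{
	if (tag < 0 || tag >= MODEL_DEPTH)
		return NULL;
	return tags->rqs[tag];
}
```

In this model, a request allocated with publish == 0 cannot be found by
model_tag_to_rq() when its completion arrives, which corresponds to the
"tag not found" symptom. With publish != 0 the lookup returns the request.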

Parav



