Re: [PATCH] blk-mq: Add NULL pointer check for HW dispatch queue

Somnath Kotur <somnath.kotur@xxxxxxxxxxxx> · Tue, 28 Mar 2017 14:05:18 +0530

On Mon, Mar 27, 2017 at 2:44 PM, Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
>
> On Mon, Mar 20, 2017 at 03:10:01PM +0530, Jitendra Bhivare wrote:
> > As part of blk_mq_realloc_hw_ctx(), if the init_hctx() ops is
> > failed by the underlying transport, the hctx pointer is freed and
> > initialized to NULL.
> > However, functions down the line, access this hwctx pointer without
> > a NULL pointer check, which could lead to a kernel crash.
>
> Shouldn't we fail initializing the queue if any of the hctx allocations
> fail?

Well, to give some more background on the issue, here is the
dump_stack output showing where the failure happens:

Mar 18 08:27:31 dhcp-10-192-204-6 kernel: [<ffffffffa05d42d6>] ib_alloc_mr+0x26/0x50 [ib_core]
Mar 18 08:27:31 dhcp-10-192-204-6 kernel: [<ffffffffa0a37691>] __nvme_rdma_init_request+0xc1/0x230 [nvme_rdma]
Mar 18 08:27:31 dhcp-10-192-204-6 kernel: [<ffffffffa0a37831>] nvme_rdma_init_request+0x11/0x20 [nvme_rdma]
Mar 18 08:27:31 dhcp-10-192-204-6 kernel: [<ffffffff813429bb>] blk_mq_init_rq_map+0x23b/0x2b0
Mar 18 08:27:31 dhcp-10-192-204-6 kernel: [<ffffffff81342e25>] blk_mq_alloc_tag_set+0x135/0x2c0
Mar 18 08:27:31 dhcp-10-192-204-6 kernel: [<ffffffffa0a37cc3>] nvme_rdma_create_ctrl+0x483/0x710 [nvme_rdma]
Mar 18 08:27:31 dhcp-10-192-204-6 kernel: [<ffffffffa0a2c127>] nvmf_dev_write+0x727/0x93c [nvme_fabrics]
Mar 18 08:27:31 dhcp-10-192-204-6 kernel: [<ffffffff812320e7>] __vfs_write+0x37/0x160

The ctrl->queue_count in nvme_rdma_create_ctrl() is initialized like so:

ctrl->queue_count = opts->nr_io_queues + 1; /* +1 for admin queue */

where opts->nr_io_queues is typically set to num_online_cpus(), which
in my case turned out to be 16. The failure I encountered was for the
14th CPU: alloc_mr() failed because we had reached the limit on MRs
in our chip.

The point is that after this failure, functions like
blk_mq_init_cpu_queues() and blk_mq_map_swqueue() use a code snippet
like the one below to access the hctxs:

for_each_possible_cpu(i) {
	....
	hctx = blk_mq_map_queue(q, i);
	hctx->....		/* crash if ptr is NULL */
	..
}
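
To make the proposed check concrete, here is a rough userspace sketch of
the pattern. It is not the actual blk-mq code: struct hw_ctx, struct
queue, map_queue(), map_swqueues() and the NR_* constants are simplified
stand-ins made up for illustration, and map_queue() only mimics the role
of blk_mq_map_queue() by returning NULL for a context that was never
set up.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS   4	/* hypothetical CPU count for this sketch */
#define NR_QUEUES 2	/* hypothetical number of hardware queues */

/* Simplified stand-ins, not the real blk-mq structures. */
struct hw_ctx {
	unsigned int nr_ctx;	/* software queues mapped to this hctx */
};

struct queue {
	struct hw_ctx *hctxs[NR_QUEUES];	/* NULL where init failed */
	unsigned int cpu_to_queue[NR_CPUS];
};

/* Plays the role of blk_mq_map_queue(): the result may be NULL. */
static struct hw_ctx *map_queue(struct queue *q, unsigned int cpu)
{
	assert(q != NULL && cpu < NR_CPUS);
	return q->hctxs[q->cpu_to_queue[cpu]];
}

/*
 * Walk every CPU and skip those whose hctx was never set up.
 * Returns the number of CPUs that were skipped.
 */
static unsigned int map_swqueues(struct queue *q)
{
	unsigned int cpu, skipped = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		struct hw_ctx *hctx = map_queue(q, cpu);

		if (!hctx) {	/* the kind of check the patch adds */
			skipped++;
			continue;
		}
		hctx->nr_ctx++;
	}
	return skipped;
}
```

With the check in place, a queue whose second hctx is NULL is walked
without a dereference of the missing context; whether silently skipping
those CPUs is the right behaviour for the real code is a separate
question.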

I'm not that familiar with the blk code itself, so perhaps there is a
better way of fixing it, but we have pointed out the problem and a
possible fix. Isn't this more of a bug in the error-handling path?
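
For comparison, the alternative Christoph raises (fail the whole
initialization if any hctx fails) could look roughly like the userspace
sketch below. Again this is only an illustration under my own
assumptions: sketch_hctx, sketch_init_hctx(), sketch_alloc_hctxs() and
the fail_at parameter are hypothetical names, and fail_at just simulates
a transport that runs out of a resource (MRs in our case) partway
through.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a hardware context. */
struct sketch_hctx {
	unsigned int index;
};

/* Hypothetical init hook; fails at index fail_at to mimic the MR limit. */
static int sketch_init_hctx(struct sketch_hctx *h, unsigned int i,
			    unsigned int fail_at)
{
	if (i == fail_at)
		return -ENOMEM;
	h->index = i;
	return 0;
}

/*
 * Allocate and initialize nr contexts. If any step fails, free every
 * context set up so far and return the error, so callers never see a
 * table with NULL holes in it.
 */
static int sketch_alloc_hctxs(struct sketch_hctx **tbl, unsigned int nr,
			      unsigned int fail_at)
{
	unsigned int i;
	int ret;

	assert(tbl != NULL);

	for (i = 0; i < nr; i++) {
		tbl[i] = calloc(1, sizeof(*tbl[i]));
		if (!tbl[i]) {
			ret = -ENOMEM;
			goto unwind;
		}
		ret = sketch_init_hctx(tbl[i], i, fail_at);
		if (ret) {
			free(tbl[i]);
			tbl[i] = NULL;
			goto unwind;
		}
	}
	return 0;

unwind:
	/* Release the contexts that were fully set up before the failure. */
	while (i-- > 0) {
		free(tbl[i]);
		tbl[i] = NULL;
	}
	return ret;
}
```

The trade-off is that one failing queue takes down the whole setup,
whereas the NULL-check approach tries to carry on with the queues that
did initialize.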

Thanks
Som