Re: [BUG] nvme driver crash

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 7/26/2017 10:12 PM, Omar Sandoval wrote:
On Wed, Jul 26, 2017 at 10:34:43AM -0700, Christoph Hellwig wrote:
On Tue, Jul 25, 2017 at 03:24:08PM -0700, Shaohua Li wrote:
Disable CONIFG_SMP, kernel crashes at boot time, here is the log.

I can reproduce the issue.  Unfortunately the addresss in the bug
doesn't make any sense to me when resolving it using gdb, as t just
points to the line where blk_mq_init_queue calls
blk_alloc_queue_node.

Can you check if you get better results in your build?

It's crashing on that line because ctrl->tagset is NULL, which is
because...

irq_create_affinity_masks() returns NULL on !CONFIG_SMP
-> pci_irq_get_affinity() returns NULL
-> blk_mq_pci_map_queues() returns -EINVAL
-> blk_mq_alloc_tag_set() returns -EINVAL
-> nvme_dev_add() doesn't set ctrl->tagset

The two-fold fix would be to make the nvme driver handle
blk_mq_alloc_tag_set() failing and to fall back to a dumb mapping in
blk_mq_pci_map_queues(), but I don't know what the best way to do those
is.


Adding Sagi and Keith.

Christoph,
I've send some fix few months ago to that but haven't got a green light:


nvme: don't ignore tagset allocation failures

the nvme_dev_add() function silently ignores failures.
In case blk_mq_alloc_tag_set fails, we hit NULL deref while
calling blk_mq_init_queue during nvme_alloc_ns with tagset == NULL.
Instead, we'll not issue the scan_work in case tagset allocation
failed and leave the ctrl functional.

Signed-off-by: Max Gurtovoy <maxg@xxxxxxxxxxxx>
Reviewed-by: Keith Busch <keith.busch@xxxxxxxxx>
---
 drivers/nvme/host/core.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 9b3b57f..493722a 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2115,9 +2115,9 @@ void nvme_queue_scan(struct nvme_ctrl *ctrl)
 {
 	/*
 	 * Do not queue new scan work when a controller is reset during
-	 * removal.
+	 * removal or if the tagset doesn't exist.
 	 */
-	if (ctrl->state == NVME_CTRL_LIVE)
+	if (ctrl->state == NVME_CTRL_LIVE && ctrl->tagset)
 		schedule_work(&ctrl->scan_work);
 }
 EXPORT_SYMBOL_GPL(nvme_queue_scan);


maybe we can rebase and consider it again ?




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux