https://bugzilla.kernel.org/show_bug.cgi?id=216453

Bug ID: 216453
Summary: scsi: megaraid_sas: possible null pointer dereference in megasas_slave_alloc()
Product: IO/Storage
Version: 2.5
Kernel Version: 5.10.0
Hardware: ARM
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: SCSI
Assignee: linux-scsi@xxxxxxxxxxxxxxx
Reporter: r33s3n6@xxxxxxxxx
Regression: No

Hello,

Our fault injection tool found a possible null-pointer dereference in the
megaraid_sas driver in Linux 5.10.0.

All line numbers below refer to drivers/scsi/megaraid/megaraid_sas_base.c.

In megasas_get_seq_num(), the call to dma_alloc_coherent() may fail:

    6459: el_info = dma_alloc_coherent(&instance->pdev->dev,
                        sizeof(struct megasas_evt_log_info),
                        &el_info_h, GFP_KERNEL);

This error is propagated to its caller, megasas_start_aen():

    6749: if (megasas_get_seq_num(instance, &eli))
    6750:     return -1;

It is then propagated again to its caller, megasas_probe_one():

    7428: if (megasas_start_aen(instance)) {
    7429:     dev_printk(KERN_DEBUG, &pdev->dev, "start aen failed\n");
    7430:     goto fail_start_aen;
    7431: }

The error handling code of megasas_probe_one() removes the pointer
`instance` from `megasas_mgmt_info.instance`:

    7445: megasas_mgmt_info.instance[megasas_mgmt_info.max_index] = NULL;

However, `instance` was stored in the pdev earlier by pci_set_drvdata(),
and the error handling code does nothing about that:

    7401: pci_set_drvdata(pdev, instance);

Then, in another thread, megasas_slave_alloc() is called. This function
calls megasas_lookup_instance() to get the pointer `instance`, which can no
longer be found in `megasas_mgmt_info.instance`, so NULL is returned:

    2087: instance = megasas_lookup_instance(sdev->host->host_no);

The next use of `instance` is therefore a null-pointer dereference:

    2095: if ((instance->pd_list_not_supported ||
               instance->pd_list[pd_index].driveState ==
               MR_PD_STATE_SYSTEM))

If we just add a NULL check for `instance` here, another bug shows up.
megasas_fault_detect_work() runs in a separate thread and retrieves the
pointer `instance` from `work`:

    1901: struct megasas_instance *instance =
              container_of(work, struct megasas_instance,
                           fw_fault_work.work);

Because the structure that `instance` points to is broken, the following
accesses through `instance` cause page faults:

    1907: fw_state = instance->instancet->read_fw_status_reg(instance) &
                     MFI_STATE_MASK;
    1911: dma_state = instance->instancet->read_fw_status_reg(instance) &
                      MFI_STATE_DMADONE;
    ...

I am not quite sure how to fix this possible bug. Any feedback would be
appreciated, thanks!

Reported-by: TOTE Robot <oslab@xxxxxxxxxxxxxxx>

Best wishes,
Zixuan Fu
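
For illustration, the lookup failure described above can be modelled in user space. This is only a minimal sketch, not kernel code: every `model_*` name is hypothetical, the table stands in for `megasas_mgmt_info.instance`, the lookup loop is an assumption about how megasas_lookup_instance() matches on host_no, and -ENXIO is an assumed error code that the report does not name.

```c
/* User-space model of the failed lookup; NOT the megaraid_sas driver. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_MAX_INSTANCES 4 /* hypothetical table size */

struct model_host {
    unsigned int host_no;
};

struct model_instance {
    struct model_host host;
    int pd_list_not_supported;
};

/* Stand-in for megasas_mgmt_info.instance[]. */
static struct model_instance *model_table[MODEL_MAX_INSTANCES];

/* Probe path on the way up: publish the instance in a slot. */
void model_register(size_t slot, struct model_instance *inst)
{
    assert(slot < MODEL_MAX_INSTANCES);
    model_table[slot] = inst;
}

/* Probe error path: the slot is cleared again. */
void model_unregister(size_t slot)
{
    assert(slot < MODEL_MAX_INSTANCES);
    model_table[slot] = NULL;
}

/* Assumed lookup behaviour: return NULL when no slot matches host_no. */
struct model_instance *model_lookup(unsigned int host_no)
{
    for (size_t i = 0; i < MODEL_MAX_INSTANCES; i++) {
        if (model_table[i] && model_table[i]->host.host_no == host_no)
            return model_table[i];
    }
    return NULL;
}

/* Model of the slave_alloc step with a NULL check before the first use. */
int model_slave_alloc(unsigned int host_no)
{
    struct model_instance *inst = model_lookup(host_no);

    if (!inst)
        return -ENXIO; /* assumed error code */

    (void)inst->pd_list_not_supported; /* safe: inst is non-NULL here */
    return 0;
}
```

Calling model_slave_alloc() after model_unregister() returns the error instead of dereferencing NULL. As noted in the report, such a check only covers this one call site; it does nothing about work that still holds a pointer to the instance, such as fw_fault_work.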