Re: [PATCH] mpt3sas: Fix kernel panic observed on soft HBA unplug

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2020-03-11 at 06:36 -0400, Sreekanth Reddy wrote:
> Generic protection fault type kernel panic is observed when user
> performs soft(ordered) HBA unplug operation while IOs are running
> on drives connected to HBA.
> 
> When user performs ordered HBA removal operation then kernel calls
> PCI device's .remove() call back function where driver is flushing
> out
> all the outstanding SCSI IO commands with DID_NO_CONNECT host byte
> and
> also un-maps sg buffers allocated for these IO commands.
> But in the ordered HBA removal case (unlike of real HBA hot unplug)
> HBA device is still alive and hence HBA hardware is performing the
> DMA operations to those buffers on the system memory which are
> already
> unmapped while flushing out the outstanding SCSI IO commands
> and this leads to Kernel panic.
> 
> Fix:
> Don't flush out the outstanding IOs from .remove() path in case of
> ordered HBA removal since HBA will be still alive in this case and
> it can complete the outstanding IOs. Flush out the outstanding IOs
> only in case physical HBA hot unplug where their won't be any
> communication with the HBA.

Can you please point to the commit that introduces the bug?

> 
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Sreekanth Reddy <sreekanth.reddy@xxxxxxxxxxxx>
> ---
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index 778d5e6..04a40af 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -9908,8 +9908,8 @@ static void scsih_remove(struct pci_dev *pdev)
>  
>  	ioc->remove_host = 1;
>  
> -	mpt3sas_wait_for_commands_to_complete(ioc);
> -	_scsih_flush_running_cmds(ioc);
> +	if (!pci_device_is_present(pdev))
> +		_scsih_flush_running_cmds(ioc);
>  
>  	_scsih_fw_event_cleanup_queue(ioc);
>  
> @@ -9992,8 +9992,8 @@ static void scsih_remove(struct pci_dev *pdev)

Just a note: this function is scsih_shutdown().  Doesn't block
application of the patch, though.  Just wondering how the patch was
created.

>  
>  	ioc->remove_host = 1;
>  
> -	mpt3sas_wait_for_commands_to_complete(ioc);
> -	_scsih_flush_running_cmds(ioc);
> +	if (!pci_device_is_present(pdev))
> +		_scsih_flush_running_cmds(ioc);
>  
>  	_scsih_fw_event_cleanup_queue(ioc);
>  




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux