Re: [PATCH 3/5] megaraid_sas: do not crash on invalid completion

On 11/11/2016 12:51 PM, Sumit Saxena wrote:
>> -----Original Message-----
>> From: linux-scsi-owner@xxxxxxxxxxxxxxx [mailto:linux-scsi-owner@xxxxxxxxxxxxxxx] On Behalf Of Hannes Reinecke
>> Sent: Friday, November 11, 2016 3:15 PM
>> To: Martin K. Petersen
>> Cc: Christoph Hellwig; James Bottomley; Sumit Saxena; linux-scsi@xxxxxxxxxxxxxxx; Hannes Reinecke; Hannes Reinecke
>> Subject: [PATCH 3/5] megaraid_sas: do not crash on invalid completion
>>
>> Avoid a kernel oops when receiving an invalid command completion.
> A NULL scmd_local (in the MPI2_FUNCTION_SCSI_IO_REQUEST and
> MEGASAS_MPI2_FUNCTION_LD_IO_REQUEST cases) is a serious bug, either in the
> driver or in the firmware, which should be debugged, and the driver should
> not really continue beyond that point. It indicates that the driver's
> internal frames are corrupted. If needed, the driver can mark the adapter
> as dead whenever it detects this, stopping further activity.
> If the OS is installed behind the megasas controller, a system reboot will
> be required after the adapter is declared dead. A kernel panic may give
> more information whenever this condition hits, so we kept it like this.
> If you are facing this issue, please share the details. I will work on
> this.


I came across this problem while developing scsi-mq support. Because of a missing MMIO barrier when writing to the inbound queue port, the I/O submission became confused, resulting in already-completed frames appearing on the completion queue. While I do agree this is a pretty serious problem, the driver should _not_ crash; after all, it has just received a completion for an unknown command. That is no reason to take the kernel down. I'd be in favour of resetting the HBA and taking it offline if required, but we really should not crash here.

Incidentally, we _will_ take the HBA offline even now: these invalid command completions cause a SCSI timeout for the original command, and as the HBA can't send completions for them, the driver will eventually offline the card. So it worked as expected from my POV.
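
To illustrate the behaviour argued for above, here is a minimal userspace sketch, not the actual patch and not the megaraid_sas code. All names (fake_adapter, fake_cmd, fake_scmd, complete_one) are hypothetical stand-ins, and the 1-based SMID indexing and the range check are assumptions made for the example. The point is only that a completion with no SCSI command attached gets logged and dropped instead of being dereferenced:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; not the driver's structures. */
#define MAX_CMDS 4

struct fake_scmd {
	int result;
	bool done;
};

struct fake_cmd {
	struct fake_scmd *scmd;	/* NULL when no SCSI command is attached */
};

struct fake_adapter {
	struct fake_cmd cmd_list[MAX_CMDS];
	unsigned int invalid_completions;
};

/*
 * Complete the command identified by a 1-based SMID (assumed convention).
 * Returns true if a command was completed, false if the completion was
 * invalid and has been dropped.
 */
static bool complete_one(struct fake_adapter *adap, unsigned int smid)
{
	struct fake_cmd *cmd;
	struct fake_scmd *scmd;

	/* Assumption for this sketch: also reject out-of-range SMIDs. */
	if (smid == 0 || smid > MAX_CMDS) {
		adap->invalid_completions++;
		fprintf(stderr, "dropping completion, bad SMID %u\n", smid);
		return false;
	}

	cmd = &adap->cmd_list[smid - 1];
	scmd = cmd->scmd;
	if (scmd == NULL) {
		/* Completion for an unknown command: log it, do not crash. */
		adap->invalid_completions++;
		fprintf(stderr, "dropping completion, SMID %u has no command\n",
			smid);
		return false;
	}

	scmd->result = 0;
	scmd->done = true;
	cmd->scmd = NULL;	/* the frame is free again */
	return true;
}
```

In this sketch a caller could use the invalid_completions counter to decide whether to reset the HBA or take it offline; the policy itself is left open here.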

Cheers,

Hannes
--
Dr. Hannes Reinecke		      zSeries & Storage
hare@xxxxxxx			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


