RE: [PATCH 1/1] PCI/AER: Ignore correctable error reports for SN730 WD SSD

>From: Keith Busch <kbusch@xxxxxxxxxx> 
>Sent: Tuesday, January 17, 2023 5:55 PM
>To: Alexey Bogoslavsky <Alexey.Bogoslavsky@xxxxxxx>
>Cc: linux-pci@xxxxxxxxxxxxxxx; bhelgaas@xxxxxxxxxx; 'hch@xxxxxx' <hch@xxxxxx>
>Subject: Re: [PATCH 1/1] PCI/AER: Ignore correctable error reports for SN730 WD SSD

>On Mon, Jan 16, 2023 at 06:32:54PM +0000, Alexey Bogoslavsky wrote:
>> From: Alexey Bogoslavsky <Alexey.Bogoslavsky@xxxxxxx>
>>
>> A bug was found in SN730 WD SSD that causes occasional false AER reporting
>> of correctable errors. While functionally harmless, this causes error
>> messages to appear in the system log (dmesg) which, in turn, causes
>> problems in automated platform validation tests. Since the issue cannot
>> be fixed by FW, customers asked for correctable error reporting to be
>> quirked out in the kernel for this particular device.
>
>> The patch was manually verified. It was checked that correctable errors
>> are still detected but ignored for the target device (SN730), and are both
>> detected and reported for devices not affected by this quirk.

>If you're just going to have the kernel ignore these, are you not able
>to suppress the ERR_COR message at the source? Have the following
>options been tried?

> a. Disabling Correctable Error Reporting Enable in Device Control
>    Register; i.e. mask out PCI_EXP_DEVCTL_CERE.
> b. Setting AER Correctable Error Mask Register to all 1's

>I think it's usually possible for firmware to hardwire these.
I believe these options were discussed but deemed non-viable. I'll double-check anyway.
>If firmware can't do that, quirking the kernel to always disable reporting
>sounds like a better option. If either of the above fail to suppress the
>error messages, then I guess having the kernel ignore it is the only
>option.
This could probably work. I'll discuss it with our FW team to make sure the issue can
be resolved this way. Thank you.
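
For reference, the two suppression options from the quoted list can be sketched
in user-space C against an in-memory copy of a device's configuration space.
This is only an illustration, not the proposed patch: the register offsets and
the CERE bit value are taken from include/uapi/linux/pci_regs.h, while the
cfg_* accessors, the two option helpers, and the buffer size are hypothetical
names made up for this sketch. The capability offsets are passed in by the
caller rather than discovered by walking the capability list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/pci_regs.h. */
#define PCI_EXP_DEVCTL      0x08    /* Device Control, offset within PCIe capability */
#define PCI_EXP_DEVCTL_CERE 0x0001  /* Correctable Error Reporting Enable */
#define PCI_ERR_COR_MASK    0x14    /* Correctable Error Mask, offset within AER capability */

/* Assumption for this sketch: a full 4 KiB extended config space copy. */
#define CFG_SPACE_SIZE 4096

/* Hypothetical little-endian accessors on the config-space copy. */
static uint16_t cfg_read16(const uint8_t *cfg, size_t off)
{
    assert(off + 2 <= CFG_SPACE_SIZE);
    return (uint16_t)(cfg[off] | ((uint16_t)cfg[off + 1] << 8));
}

static void cfg_write16(uint8_t *cfg, size_t off, uint16_t val)
{
    assert(off + 2 <= CFG_SPACE_SIZE);
    cfg[off] = (uint8_t)(val & 0xff);
    cfg[off + 1] = (uint8_t)(val >> 8);
}

static uint32_t cfg_read32(const uint8_t *cfg, size_t off)
{
    assert(off + 4 <= CFG_SPACE_SIZE);
    return (uint32_t)cfg[off] |
           ((uint32_t)cfg[off + 1] << 8) |
           ((uint32_t)cfg[off + 2] << 16) |
           ((uint32_t)cfg[off + 3] << 24);
}

static void cfg_write32(uint8_t *cfg, size_t off, uint32_t val)
{
    assert(off + 4 <= CFG_SPACE_SIZE);
    cfg[off] = (uint8_t)(val & 0xff);
    cfg[off + 1] = (uint8_t)((val >> 8) & 0xff);
    cfg[off + 2] = (uint8_t)((val >> 16) & 0xff);
    cfg[off + 3] = (uint8_t)(val >> 24);
}

/* Option (a): clear Correctable Error Reporting Enable in Device Control. */
static void disable_cor_reporting(uint8_t *cfg, size_t pcie_cap)
{
    uint16_t ctl = cfg_read16(cfg, pcie_cap + PCI_EXP_DEVCTL);

    ctl = (uint16_t)(ctl & ~PCI_EXP_DEVCTL_CERE);
    cfg_write16(cfg, pcie_cap + PCI_EXP_DEVCTL, ctl);
}

/* Option (b): set the AER Correctable Error Mask register to all 1's. */
static void mask_all_cor_errors(uint8_t *cfg, size_t aer_cap)
{
    cfg_write32(cfg, aer_cap + PCI_ERR_COR_MASK, 0xffffffffu);
}
```

In a kernel quirk the same two writes would presumably go through the regular
config accessors, e.g. pcie_capability_clear_word() for option (a) and
pci_write_config_dword() at the device's AER capability offset for option (b).
On real hardware only the implemented mask bits are writable, so the value
read back after option (b) may not be all 1's.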



