Re: [PATCH 5/8] PCI/AER: Introduce ratelimit for AER IRQs


 



On 25/01/2025 08:39, Lukas Wunner wrote:

Masking errors at the register level feels overzealous,
in particular because it also disables logging via tracepoints.

Is there a concrete device that necessitates this change?

I ran into excessive Correctable Error reporting with a Samsung PM1733 NVMe drive (a couple of thousand errors per hour). The messages were still polluting the logs even after introducing a ratelimit (first every 2s, second every 30s, as proposed in [1]). Also, instead of masking the errors automatically, we could give the user a sysfs knob to turn error generation off and on.

If there is, consider adding a quirk for this particular device
which masks specific errors, but doesn't affect other devices.

There were many other reports of Correctable Error floods, as signaled in the cover letter, so it's hard to pinpoint the specific driver that should mask these errors.

All the best,
Karolina

-------------------------------------
[1] - https://lore.kernel.org/linux-pci/cover.1736341506.git.karolina.stolarek@xxxxxxxxxx/

If there isn't, consider dropping this change until a buggy device
appears that actually needs it.

Thanks,

Lukas




