[PATCH RESEND 0/4] Rate limit reporting of Correctable Errors

On Wed, 8 Jan 2025 13:55:30 +0000
Karolina Stolarek <karolina.stolarek@xxxxxxxxxx> wrote:
> TL;DR
> ====
> 
> We are getting multiple reports about excessive logging of Correctable
> Errors with no clear common root cause. As these errors are already
> corrected by hardware, it makes sense to limit them. Introduce
> a ratelimit state definition to pci_dev to control the number of
> messages reported by a Root Port within a specified time interval.
> The series adds other improvements in the area, as outlined in the
> Proposal section.

Hi Karolina,

This is a common impediment for many folks who want to enable AER. The
excessive logging stalls execution, making machines unusable. I've been
working on a similar solution[1] to yours (i.e. ratelimiting), with a few
differences:

- ratelimit uncorrectable errors
- ratelimit IRQs
- configure ratelimits from userspace (sysfs knobs)

Hoping we can collaborate on a solution (i.e. take the best parts of both
patch series).

Thanks,
Jon

[1] https://lore.kernel.org/linux-pci/20250115074301.3514927-1-pandoh@xxxxxxxxxx/



