Re: [PATCH v4 5/7] PCI/AER: Introduce ratelimit for error logs

Karolina Stolarek <karolina.stolarek@xxxxxxxxxx> · Thu, 20 Mar 2025 15:56:53 +0100

On 20/03/2025 09:20, Jon Pan-Doh wrote:
Spammy devices can flood kernel logs with AER errors and slow/stall 
execution. Add per-device ratelimits for AER correctable and 
uncorrectable errors that use the kernel defaults (10 per 5s).

Tested using aer-inject[1]. Sent 11 AER errors. Observed 10 errors
logged while AER stats
(cat /sys/bus/pci/devices/<dev>/aer_dev_correctable) show true count
of 11.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/gong.chen/aer-inject.git
>
Signed-off-by: Jon Pan-Doh <pandoh@xxxxxxxxxx>
Reviewed-by: Karolina Stolarek <karolina.stolarek@xxxxxxxxxx>

For future reference -- please drop r-bs from patches that have 
functional/bigger changes. New code nullifies previous reviews.

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 3069376b3553..081cef5fc678 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -88,6 +89,10 @@ struct aer_report {
  	u64 rootport_total_cor_errs;
  	u64 rootport_total_fatal_errs;
  	u64 rootport_total_nonfatal_errs;
+
+	/* Ratelimits for errors */
+	struct ratelimit_state cor_log_ratelimit;
+	struct ratelimit_state uncor_log_ratelimit;
  };
  
  #define AER_LOG_TLP_MASKS		(PCI_ERR_UNC_POISON_TLP|	\
@@ -379,6 +384,15 @@ void pci_aer_init(struct pci_dev *dev)
  
  	dev->aer_report = kzalloc(sizeof(*dev->aer_report), GFP_KERNEL);
  
+	/*
+	 * Ratelimits are doubled as a given error produces 2 logs (root port
+	 * and endpoint) that should be under same ratelimit.
+	 */

It took me a bit to understand what this comment is about.

When we handle an error message, we first use the source's ratelimit to
decide whether to print the port info, and then again to decide whether
to print the actual error. In theory, the same message could carry more
errors of the same class from other devices. For those devices, we
would check the ratelimit just once. I don't have a nice and clean
solution for this problem; I just wanted to highlight that 1) we don't
use the Root Port's ratelimit in aer_print_port_info(), and 2) the
burst may be spent either on port_info + error message or on just the
error message, in different combinations. I think we should reword this
comment to highlight the fact that we don't check the ratelimit once
per error; we could do it twice.
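
To illustrate the accounting I have in mind, here is a rough user-space
model, not kernel code. struct model_ratelimit and all model_*()
helpers are hypothetical names I made up for this sketch; it only
assumes that every check that passes consumes one slot of the burst and
that all errors arrive within one interval.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the kernel's ratelimit state. */
struct model_ratelimit {
	int interval;	/* window length, in seconds */
	int burst;	/* checks allowed to pass per window */
	int begin;	/* start of the current window */
	int passed;	/* checks that passed in the current window */
};

void model_ratelimit_init(struct model_ratelimit *rs, int interval, int burst)
{
	rs->interval = interval;
	rs->burst = burst;
	rs->begin = 0;
	rs->passed = 0;
}

/* Every call that passes consumes one slot of the burst. */
bool model_ratelimit_check(struct model_ratelimit *rs, int now)
{
	if (now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->passed = 0;
	}
	if (rs->passed >= rs->burst)
		return false;
	rs->passed++;
	return true;
}

/*
 * Count the errors whose final message passes the ratelimit when each
 * error performs checks_per_error checks (2 = port info + error,
 * 1 = error only). All errors arrive at t = 0, within a single window.
 */
int model_errors_logged(int burst, int nr_errors, int checks_per_error)
{
	struct model_ratelimit rs;
	int logged = 0;

	model_ratelimit_init(&rs, 5, burst);
	for (int i = 0; i < nr_errors; i++) {
		bool ok = false;

		for (int c = 0; c < checks_per_error; c++)
			ok = model_ratelimit_check(&rs, 0);
		if (ok)
			logged++;
	}
	return logged;
}

/* Self-check of the three cases discussed in this mail. */
void model_selftest(void)
{
	assert(model_errors_logged(20, 11, 2) == 10);
	assert(model_errors_logged(20, 11, 1) == 11);
	assert(model_errors_logged(10, 11, 2) == 5);
}
```

In this model, a burst of 20 with two checks per error logs 10 of 11
errors, which matches the test in the commit message. With a single
check per error all 11 get through, and an undoubled burst of 10 with
two checks per error would log only 5. So the effective limit depends
on how many checks each error ends up making.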

Also, I wonder -- do only Endpoints generate error messages? From what I
understand, some errors can be detected by intermediary devices.

+	ratelimit_state_init(&dev->aer_report->cor_log_ratelimit,
+			     DEFAULT_RATELIMIT_INTERVAL, DEFAULT_RATELIMIT_BURST * 2);
+	ratelimit_state_init(&dev->aer_report->uncor_log_ratelimit,
+			     DEFAULT_RATELIMIT_INTERVAL, DEFAULT_RATELIMIT_BURST * 2);
+
  	/*
  	 * We save/restore PCI_ERR_UNCOR_MASK, PCI_ERR_UNCOR_SEVER,
  	 * PCI_ERR_COR_MASK, and PCI_ERR_CAP.  Root and Root Complex Event
@@ -668,6 +682,17 @@ static void pci_rootport_aer_stats_incr(struct pci_dev *pdev,
  	}
  }
  
+static int aer_ratelimit(struct pci_dev *dev, unsigned int severity)

I really like this solution; it's nice and tidy.


  static void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info)
  {

I'm also thinking -- we are ratelimiting the aer_print_port_info() and 
aer_print_error(). What about the messages in dpc_process_error()? 
Should we check early if DPC was triggered because of an uncorrectable 
error, and if so, ratelimit that?

All the best,
Karolina