I got the cold chills when I realized you called for a delay of 350ms. It's because 350ms is around the delay I've observed to be caused by FFS.

First run KASANed with the extra delay, so hopefully, I'll have more cement test results by EOB today.

Alex

-----Original Message-----
From: Scott Bauer [mailto:scott.bauer@xxxxxxxxx]
Sent: Thursday, April 12, 2018 11:47 AM
To: Gagniuc, Alexandru - Dell Team
Cc: keith.busch@xxxxxxxxx; linux-pci@xxxxxxxxxxxxxxx; bhelgaas@xxxxxxxxxx
Subject: Re: [PATCH 0/4] PCI/AER: Use-after-free fix

On Thu, Apr 12, 2018 at 05:06:05PM +0000, Alex_Gagniuc@xxxxxxxxxxxx wrote:
> From: Keith Busch [mailto:keith.busch@xxxxxxxxx]
>
> > AER error handling walks the PCI topology below a root port, saving pointers of the pci_dev structs affected by the error along the way.
>
> Hi Keith,
>
> I've been trying to do an ABA test to confirm that your change eliminates the use-after-free issue we've seen. The race seems to be quite elusive, so I can't reliably reproduce it. Your changes have not been forgotten; I have them staged for further testing.
>
> Alex

If you need help triggering the race you can add a sleep/microsleep here:
aer_isr_one_error() between the find_source_device and process err device:

sbauer@sbauer-Z170X-UD5:~/nvme_code/upstream_jens/linux-block$ git diff drivers/pci/pcie/aer/aerdrv_core.c
diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
index a4bfea52e7d4..5ca0c07b1d05 100644
--- a/drivers/pci/pcie/aer/aerdrv_core.c
+++ b/drivers/pci/pcie/aer/aerdrv_core.c
@@ -22,6 +22,7 @@
 #include <linux/delay.h>
 #include <linux/slab.h>
 #include <linux/kfifo.h>
+#include <linux/delay.h>
 #include "aerdrv.h"
 
 #define PCI_EXP_AER_FLAGS	(PCI_EXP_DEVCTL_CERE | PCI_EXP_DEVCTL_NFERE | \
@@ -740,8 +741,10 @@ static void aer_isr_one_error(struct pcie_device *p_device,
 
 		aer_print_port_info(p_device->port, e_info);
 
-		if (find_source_device(p_device->port, e_info))
+		if (find_source_device(p_device->port, e_info)) {
+			msleep(350);
 			aer_process_err_devices(p_device, e_info);
+		}
 	}
 
 	if (e_src->status & PCI_ERR_ROOT_UNCOR_RCV) {
@@ -759,8 +762,10 @@ static void aer_isr_one_error(struct pcie_device *p_device,
 
 		aer_print_port_info(p_device->port, e_info);
 
-		if (find_source_device(p_device->port, e_info))
+		if (find_source_device(p_device->port, e_info)) {
+			msleep(350);
 			aer_process_err_devices(p_device, e_info);
+		}
 	}
 }