On Thu, Apr 12, 2018 at 05:06:05PM +0000, Alex_Gagniuc@xxxxxxxxxxxx wrote:
> From: Keith Busch [mailto:keith.busch@xxxxxxxxx]
> >
> > AER error handling walks the PCI topology below a root port, saving
> > pointers of the pci_dev structs affected by the error along the way.
>
> Hi Keith,
>
> I've been trying to do an ABA test to confirm that your change
> eliminates the use-after-free issue we've seen. The race seems to be
> quite elusive, so I can't reliably reproduce it. Your changes have not
> been forgotten; I have them staged for further testing.
>
> Alex

If you need help triggering the race, you can add a sleep in
aer_isr_one_error(), between the find_source_device() and
aer_process_err_devices() calls:

sbauer@sbauer-Z170X-UD5:~/nvme_code/upstream_jens/linux-block$ git diff drivers/pci/pcie/aer/aerdrv_core.c
diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
index a4bfea52e7d4..5ca0c07b1d05 100644
--- a/drivers/pci/pcie/aer/aerdrv_core.c
+++ b/drivers/pci/pcie/aer/aerdrv_core.c
@@ -22,6 +22,7 @@
 #include <linux/delay.h>
 #include <linux/slab.h>
 #include <linux/kfifo.h>
+#include <linux/delay.h>
 #include "aerdrv.h"

 #define PCI_EXP_AER_FLAGS	(PCI_EXP_DEVCTL_CERE | PCI_EXP_DEVCTL_NFERE | \
@@ -740,8 +741,10 @@ static void aer_isr_one_error(struct pcie_device *p_device,

 		aer_print_port_info(p_device->port, e_info);

-		if (find_source_device(p_device->port, e_info))
+		if (find_source_device(p_device->port, e_info)) {
+			msleep(350);
 			aer_process_err_devices(p_device, e_info);
+		}
 	}

 	if (e_src->status & PCI_ERR_ROOT_UNCOR_RCV) {
@@ -759,8 +762,10 @@ static void aer_isr_one_error(struct pcie_device *p_device,

 		aer_print_port_info(p_device->port, e_info);

-		if (find_source_device(p_device->port, e_info))
+		if (find_source_device(p_device->port, e_info)) {
+			msleep(350);
 			aer_process_err_devices(p_device, e_info);
+		}
 	}
 }