On Thu, Oct 11, 2018 at 02:41:46PM -0500, Bjorn Helgaas wrote:
> On Thu, Oct 11, 2018 at 12:34:13PM -0600, Keith Busch wrote:
> > The aer_inject module had been intercepting config requests by overwriting
> > the config accessor operations in the pci_bus ops. This has several
> > issues.
> >
> > First, the module was tracking kernel objects unbeknownst to the drivers
> > that own them. The kernel may free those devices, leaving the AER inject
> > module holding stale references and no way to know that happened.
> >
> > Second, the PCI enumeration has child devices inherit pci_bus ops from
> > the parent bus. Since errors may lead to link resets that trigger
> > re-enumeration, the child devices would inherit operations that don't
> > know about the devices using them, causing kernel crashes.
> >
> > Finally, CONFIG_PCI_LOCKLESS_CONFIG doesn't block accessing the pci_bus
> > ops, so it's racing with potential in-flight config requests.
> >
> > This patch uses a different error injection approach leveraging ftrace
> > to thunk the config space functions. If the kernel and architecture
> > are capable, the ftrace hook will overwrite the processor's function
> > call address with the error injection function. This discreet error
> > injection doesn't modify or track driver structures, fixing the issues
> > with the current method.
> >
> > If either the kernel config or platform arch do not support the necessary
> > ftrace capabilities, the aer_inject module will fall back to the older
> > way so that it may continue to be used as before.
>
> I dropped this patch for now because the 0-day robot found something wrong.

I just saw that. Sorry for the trouble. It fails with a minimal kernel
config; I missed checking the appropriate config defines.
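
For anyone following along, the second issue quoted above (child buses
inheriting overwritten ops) can be modelled in plain userspace C. This is
only a sketch under stated assumptions: struct bus, struct bus_ops,
hook_bus() and add_child() are hypothetical names made up for illustration,
not the aer_inject code or any kernel API, and the -1 return stands in for
what would be a crash in the kernel.

```c
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel API. */
struct bus;

struct bus_ops {
	int (*read)(struct bus *bus, int where, unsigned int *val);
};

struct bus {
	struct bus_ops *ops;
	struct bus *parent;
	unsigned int cfg;	/* stand-in for real config space */
};

/* Normal accessor: returns the bus's own "config space" value. */
static int real_read(struct bus *bus, int where, unsigned int *val)
{
	(void)where;
	*val = bus->cfg;
	return 0;
}

static struct bus_ops real_ops = { .read = real_read };

/* Injector state: the single bus it hooked and that bus's original ops. */
static struct bus *tracked_bus;
static struct bus_ops *saved_ops;

/* Injector accessor: can only forward for the bus it knows about. */
static int inject_read(struct bus *bus, int where, unsigned int *val)
{
	if (bus != tracked_bus)
		return -1;	/* unknown bus: no saved ops to forward to */
	return saved_ops->read(bus, where, val);
}

static struct bus_ops inject_ops = { .read = inject_read };

/* Old-style interception: swap the ops pointer on one bus. */
static void hook_bus(struct bus *bus)
{
	tracked_bus = bus;
	saved_ops = bus->ops;
	bus->ops = &inject_ops;
}

/* Enumeration: a child bus inherits its parent's ops pointer. */
static void add_child(struct bus *parent, struct bus *child)
{
	child->parent = parent;
	child->ops = parent->ops;
}
```

Calling hook_bus() on a root bus and then add_child() for a bus that shows
up afterwards leaves the child pointing at inject_ops, which has no record
of it, so its reads fail. Hooking the function itself, rather than a
per-bus pointer, avoids having any such per-bus state to inherit.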