On Thu, Oct 11, 2018 at 12:34:13PM -0600, Keith Busch wrote:
> The aer_inject module had been intercepting config requests by overwriting
> the config accessor operations in the pci_bus ops. This has several
> issues.
>
> First, the module was tracking kernel objects unbeknownst to the drivers
> that own them. The kernel may free those devices, leaving the AER inject
> module holding stale references and no way to know that happened.
>
> Second, the PCI enumeration has child devices inherit pci_bus ops from
> the parent bus. Since errors may lead to link resets that trigger
> re-enumeration, the child devices would inherit operations that don't
> know about the devices using them, causing kernel crashes.
>
> Finally, CONFIG_PCI_LOCKLESS_CONFIG doesn't block accessing the pci_bus
> ops, so it's racing with potential in-flight config requests.
>
> This patch uses a different error injection approach leveraging ftrace
> to thunk the config space functions. If the kernel and architecture
> are capable, the ftrace hook will overwrite the processor's function
> call address with the error injection function. This discreet error
> injection doesn't modify or track driver structures, fixing the issues
> with the current method.
>
> If either the kernel config or platform arch do not support the necessary
> ftrace capabilities, the aer_inject module will fallback to the older
> way so that it may continue to be used as before.

I dropped this patch for now because the 0-day robot found something wrong.
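
For readers following the second issue in the quoted changelog (child buses inheriting overridden ops), here is a rough toy model in Python. It is not kernel code: `Bus`, `add_child`, `real_read`, and `make_injecting_read` are hypothetical names made up for this sketch, and the real pci_bus structures and aer_inject bookkeeping are more involved. It only illustrates how an ops override on a parent can leak to a bus that is enumerated later and that the injector never recorded.

```python
# Toy model only: hypothetical names, not the kernel's real data structures.

class Bus:
    def __init__(self, name, ops, parent=None):
        self.name = name
        self.ops = ops          # stands in for the bus's config accessor
        self.parent = parent


def real_read(bus, where):
    # Stand-in for the platform's genuine config read.
    return f"real read on {bus.name} at {where:#x}"


def add_child(parent, name):
    # Enumeration: the child inherits whatever ops the parent has right now.
    return Bus(name, parent.ops, parent)


def make_injecting_read(tracked):
    # The injector only knows the buses it recorded when it installed itself.
    def injecting_read(bus, where):
        if bus.name not in tracked:
            raise LookupError(f"no saved ops for {bus.name}")
        return f"injected error on {bus.name} at {where:#x}"
    return injecting_read


root = Bus("0000:00", real_read)
tracked = {root.name}
root.ops = make_injecting_read(tracked)   # override installed on the parent

# A later re-enumeration (e.g. after a link reset) creates a new child bus.
child = add_child(root, "0000:01")
```

In this sketch `child.ops` is the injector's function even though the injector has no record of `child`, so a read through it fails. A function-level hook, as the changelog proposes with ftrace, avoids touching the per-bus ops at all.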