On Thu, Oct 11, 2018 at 10:52:07AM -0600, Keith Busch wrote: > The aer_inject module had been intercepting config requests by overwriting > the config accessor operations in the pci_bus ops. This has several > issues. > > First, the module was tracking kernel objects unbeknownst to the drivers > that own them. The kernel may free those devices, leaving the AER inject > module holding stale references and no way to know that happened. > > Second, the PCI enumeration has child devices inherit pci_bus ops from > the parent bus. Since errors may lead to link resets that trigger > re-enumeration, the child devices would inherit operations that don't > know about the devices using them, causing kernel crashes. > > Finally, CONFIG_PCI_LOCKLESS_CONFIG doesn't block accessing the pci_bus > ops, so it's racing with potential in-flight config requests. > > This patch uses a different error injection approach leveraging ftrace > to thunk the config space functions. If the kernel and architecture > are capable, the ftrace hook will overwrite the processor's function > call address with the address of the error injecting function. This way > doesn't modify or track driver structures, fixing the issues with the > current method. > > If either the kernel config or platform arch do not support the necessary > ftrace capabilities, the aer_inject module will fallback to the older > way so that it may continue to be used as before. > > Signed-off-by: Keith Busch <keith.busch@xxxxxxxxx> > --- > drivers/pci/pcie/aer_inject.c | 218 ++++++++++++++++++++++++++++++++++++------ > 1 file changed, 189 insertions(+), 29 deletions(-) > > diff --git a/drivers/pci/pcie/aer_inject.c b/drivers/pci/pcie/aer_inject.c > index 726987f8d53c..4825fa465b60 100644 > --- a/drivers/pci/pcie/aer_inject.c > +++ b/drivers/pci/pcie/aer_inject.c > +static int aer_inj_legacy_read(struct pci_bus *bus, unsigned int devfn, > + int where, int size, u32 *val) > +{ > + struct pci_ops *ops, *my_ops; > + int rv; > + > + ops = __find_pci_bus_ops(bus); > + if (!ops) > + return -1; > + > + my_ops = bus->ops; > + bus->ops = ops; > + rv = ops->read(bus, devfn, where, size, val); > + bus->ops = my_ops; > + > + return rv; > +} > + > +static int aer_inj_legacy_write(struct pci_bus *bus, unsigned int devfn, > + int where, int size, u32 val) > +{ > + struct pci_ops *ops, *my_ops; > + int rv; > + > + ops = __find_pci_bus_ops(bus); > + if (!ops) > + return -1; > + > + my_ops = bus->ops; > + bus->ops = ops; > + rv = ops->write(bus, devfn, where, size, val); > + bus->ops = my_ops; > + > + return rv; > +} Is there any chance you could split this into two (or more) patches? The part above looks like pretty straightforward code movement without any functional change. It would be nice to get that part out of the way, then have a smaller patch that adds the fancy ftrace stuff. > static int aer_inj_read_config(struct pci_bus *bus, unsigned int devfn, > int where, int size, u32 *val) > { > u32 *sim; > struct aer_error *err; > unsigned long flags; > - struct pci_ops *ops; > - struct pci_ops *my_ops; > int domain; > int rv; > > @@ -204,18 +244,7 @@ static int aer_inj_read_config(struct pci_bus *bus, unsigned int devfn, > return 0; > } > out: > - ops = __find_pci_bus_ops(bus); > - /* > - * pci_lock must already be held, so we can directly > - * manipulate bus->ops. Many config access functions, > - * including pci_generic_config_read() require the original > - * bus->ops be installed to function, so temporarily put them > - * back. > - */ > - my_ops = bus->ops; > - bus->ops = ops; > - rv = ops->read(bus, devfn, where, size, val); > - bus->ops = my_ops; > + rv = aer_inj_legacy_read(bus, devfn, where, size, val); > spin_unlock_irqrestore(&inject_lock, flags); > return rv; > } > @@ -227,8 +256,6 @@ static int aer_inj_write_config(struct pci_bus *bus, unsigned int devfn, > struct aer_error *err; > unsigned long flags; > int rw1cs; > - struct pci_ops *ops; > - struct pci_ops *my_ops; > int domain; > int rv; > > @@ -252,18 +279,7 @@ static int aer_inj_write_config(struct pci_bus *bus, unsigned int devfn, > return 0; > } > out: > - ops = __find_pci_bus_ops(bus); > - /* > - * pci_lock must already be held, so we can directly > - * manipulate bus->ops. Many config access functions, > - * including pci_generic_config_write() require the original > - * bus->ops be installed to function, so temporarily put them > - * back. > - */ > - my_ops = bus->ops; > - bus->ops = ops; > - rv = ops->write(bus, devfn, where, size, val); > - bus->ops = my_ops; > + rv = aer_inj_legacy_write(bus, devfn, where, size, val); > spin_unlock_irqrestore(&inject_lock, flags); > return rv; > }