On Sat, Sep 7, 2019 at 2:25 AM Sergey Miroshnichenko <s.miroshnichenko@xxxxxxxxx> wrote:
>
> Hi Oliver,
>
> On 9/4/19 8:37 AM, Oliver O'Halloran wrote:
> > On Fri, 2019-08-16 at 19:50 +0300, Sergey Miroshnichenko wrote:
> >> Add pcibios_rescan_prepare()/_done() hooks for the powerpc platform. Now if
> >> the device's driver supports movable BARs, pcibios_rescan_prepare() will be
> >> called after the device is stopped, and pcibios_rescan_done() - before it
> >> resumes. There are no memory requests to this device between the hooks, so
> >> it is safe to rebuild the EEH address cache during that.
> >>
> >> CC: Oliver O'Halloran <oohall@xxxxxxxxx>
> >> Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@xxxxxxxxx>
> >> ---
> >>  arch/powerpc/kernel/pci-hotplug.c | 10 ++++++++++
> >>  1 file changed, 10 insertions(+)
> >>
> >> diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
> >> index 0b0cf8168b47..18cf13bba228 100644
> >> --- a/arch/powerpc/kernel/pci-hotplug.c
> >> +++ b/arch/powerpc/kernel/pci-hotplug.c
> >> @@ -144,3 +144,13 @@ void pci_hp_add_devices(struct pci_bus *bus)
> >>  	pcibios_finish_adding_to_bus(bus);
> >>  }
> >>  EXPORT_SYMBOL_GPL(pci_hp_add_devices);
> >> +
> >> +void pcibios_rescan_prepare(struct pci_dev *pdev)
> >> +{
> >> +	eeh_addr_cache_rmv_dev(pdev);
> >> +}
> >> +
> >> +void pcibios_rescan_done(struct pci_dev *pdev)
> >> +{
> >> +	eeh_addr_cache_insert_dev(pdev);
> >> +}
> >
> > Is this actually sufficient? The PE number for a device is largely
> > determined by the location of the MMIO BARs. If you move a BAR far
> > enough the PE number stored in the eeh_pe would need to be updated as
> > well.
> >
>
> Thanks for the hint! I've checked on our PowerNV: for bridges with MEM
> only it allocates PE numbers starting from 0xff down, and when there
> are MEM64 - starting from 0 up, one PE number per 4GiB.
>
> PEs are allocated during the call to pnv_pci_setup_bridge(), and I've
> added an invocation of pci_setup_bridge() after a hotplug event in the
> "Recalculate all bridge windows during rescan" patch of this series.

Sort of. On PHB3 both the 32bit and the 64bit MMIO windows are split
into 256 segments, each of which is mapped to a PE number. For the
32bit space there's a remapping table in hardware that allows
arbitrary mapping of segments to PE numbers, but in the 64bit space
the mapping is fixed, with the first segment being PE0, etc. If
there's a 64 bit BAR under a bridge the PE is really "allocated"
during the BAR assignment process, and the setup_bridge() step sets up
the EEH state based on that.

It's worth pointing out that this is why the 64bit window is usually
4GB. Bridge windows need to be aligned to a segment boundary to ensure
the devices under them are placed into a unique PE.

> Currently, if a bus already has a PE, pnv_ioda_setup_bus_PE() takes it
> and returns. I can see two ways to change it, both are not difficult to
> implement:
>
> a.1) check if MEM64 BARs appeared below the bus - allocate and assign
>      a new master PE with required number of slave PEs;
>
> a.2) if the bus now has more MEM64 than before - check if more slave
>      PEs must be reserved;
>
> b) release all the PEs before a PCI rescan and allocate+assign them
>    again after - with this approach the "Hook up the writes to
>    PCI_SECONDARY_BUS register" patch may be eliminated.
>
> Do you find any of these suitable?

I'm not sure a) would work, but even if it does b) is preferable.
There's a lot of strangeness in the powerpc PCI code as-is without
adding extra code paths to deal with. Keeping what happens at hotplug
consistent with what happens at boot will help keep things sane.

FYI in the next few days I'm going to post a series that rips out the
use of pci_dn in powernv and the generic parts of EEH (pseries still
uses it).
Assuming Bjorn isn't picking this up for 5.4 you might want to wait for
that before getting too deep into this.

Oliver
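
As a rough illustration of the fixed 64bit mapping discussed in this thread (a window split into 256 segments, first segment being PE0), here is a minimal standalone sketch. It is not kernel code: struct m64_window and the m64_* helpers are hypothetical names made up for this example, and the window base and size are whatever the caller supplies rather than real PHB3 values.

```c
#include <assert.h>
#include <stdint.h>

/* Segment count per MMIO window, as described for PHB3 in this thread. */
#define SEGMENTS_PER_WINDOW 256u

/* Hypothetical description of a 64bit MMIO window; not a kernel structure. */
struct m64_window {
	uint64_t base;	/* address of the start of the window */
	uint64_t size;	/* window size in bytes, a multiple of 256 */
};

/* The window is divided evenly into 256 segments. */
static uint64_t m64_segment_size(const struct m64_window *w)
{
	return w->size / SEGMENTS_PER_WINDOW;
}

/*
 * Fixed mapping: the first segment is PE 0, the second is PE 1, and so on.
 * Returns -1 if the address lies outside the window.
 */
static int m64_addr_to_pe(const struct m64_window *w, uint64_t addr)
{
	assert(w->size != 0 && w->size % SEGMENTS_PER_WINDOW == 0);
	if (addr < w->base || addr - w->base >= w->size)
		return -1;
	return (int)((addr - w->base) / m64_segment_size(w));
}

/* Nonzero if moving a BAR from old_addr to new_addr changes its PE number. */
static int m64_move_changes_pe(const struct m64_window *w,
			       uint64_t old_addr, uint64_t new_addr)
{
	return m64_addr_to_pe(w, old_addr) != m64_addr_to_pe(w, new_addr);
}
```

With a hypothetical 64 GiB window each segment would be 256 MiB, so a BAR moved by 4 KiB stays in the same PE while one moved by a full segment lands in the next PE. That is the case where the PE number cached in the eeh_pe would go stale.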