On Mon, Apr 27, 2020 at 09:23:34PM +0300, Sergei Miroshnichenko wrote:
> Currently PCI hotplug works on top of resources which are usually reserved
> not by the kernel, but by BIOS, bootloader, firmware, etc. These resources
> are gaps in the address space where BARs of new devices may fit, and extra
> bus number per port, so bridges can be hot-added. This series aim the BARs
> problem: it shows the kernel how to redistribute them on the run, so the
> hotplug becomes predictable and cross-platform. A follow-up patchset will
> propose a solution for bus numbers.
>
> To arrange a space for BARs of new hotplugged devices, the kernel can pause
> the drivers of working PCI devices and reshuffle the assigned BARs. When a
> driver is un-paused by the kernel, it should ioremap() the new addresses of
> its BARs.
>
> Drivers indicate their support of the feature by implementing the new hooks
> .rescan_prepare() and .rescan_done() in the struct pci_driver. If a driver
> doesn't yet support the feature, BARs of its devices will be considered as
> immovable and handled in the same way as resources with the PCI_FIXED flag:
> they are guaranteed to remain untouched.
>
> Tested on a number of x86_64 machines without any special kernel command
> line arguments:
>  - PC: i7-5930K + ASUS X99-A;
>  - PC: i5-8500 + ASUS Z370-F;
>  - Supermicro Super Server/H11SSL-i: AMD EPYC 7251;
>  - HP ProLiant DL380 G5: Xeon X5460;
>  - Dell Inspiron N5010: i5 M 480;
>  - Dell Precision M6600: i7-2920XM.
> ...

There's a lot of good work here, and I apologize that we haven't made
much progress on merging it. I suspect this will become more and more
important with Thunderbolt.

It does touch a lot of the ugliest and least maintainable code under
drivers/pci, which is *good* if we can clean it up a little bit in the
process, but it is also risky.

I expect that a few problems are inevitable because of BIOS issues,
driver issues, and devices that can't tolerate their BARs being moved.
We've tripped over a few of those devices in the past. Those can be
really hard to debug and fix since we won't have the hardware in
question.

To make them tractable, I think we will really need some way to test
at least the resource assignment pieces of this "in vitro" without
needing the actual hardware. E.g., maybe we could add enough
diagnostics so that a dmesg log would contain all the information
needed to reproduce a PCI hierarchy, the initial resource assignments,
and subsequent hotplug events in some sort of test fixture, maybe a
qemu boot or similar.

Bjorn