Depends: commit a1140e7bcb10ff96c192ee200e6cbf832f27158e ("PCI: fix serious bug when sizing bridges with additional size") Background: I have come to find that the kernel parameters for reserving resources for the hotplug bridges are not useful for Thunderbolt with native PCI enumeration. You can only increase the size so far before it fails to allocate the 32-bit MMIO windows. Nature of problem: pci=hpmemsize=nnM is parsed from drivers/pci/pci.c and used in drivers/pci/setup-bus.c. It overwrites pci_hotplug_mem_size which has a default value set by DEFAULT_HOTPLUG_MEM_SIZE. When used, this value is used to request additional space for MMIO and MMIO_PREF in equal amounts. The fallout: if you increase the value of pci=hpmemsize=nnM to be too large to fit into the 32-bit address space, the MMIO window fails to allocate and has zero-sized BARs. In the case of Thunderbolt, this means the NHI does not have the resources needed to function, and the thunderbolt.ko driver fails to probe the device. All Thunderbolt functionality is lost. Even if the NHI worked, any Thunderbolt device containing endpoints with 32-bit BARs (most of them) would fail to initialise properly. The worst case happens if some resources end up allocating with non-zero size, but too small. With some AMD Radeon external GPUs, if certain combinations of BARs are assigned / failed to assign, it can cause BUG_ON from arch/x86/mm/pat.c due to the buggy amdgpu.ko driver trying to use a zero-sized BAR, instead of failing gracefully. This more or less renders the system unusable. The cause: because the kernel parameters do not allow the user to specify the MMIO and MMIO_PREF hotplug bridge additional sizes separately, they have no choice but to give unrealistic sizes that will fail, or will be forced to make unpleasant compromises, resulting in having to tolerate unstable or unpredictable operation. The solution: My proposed patch depreciates pci=hpmemsize=nnM,hpiosize=nn but allows them to work without any change in user-visible functionality, with a KERN_WARNING message when either are used. It adds the new parameters pci=hp_io_size=nn,hp_mmio_size=nnM,hp_mmio_pref_size=nnM. If any of the new kernel parameters are used, all of the newly-depreciated ones will be overridden. This will allow for a grace period to allow people to change to the new kernel parameters without unpleasant disruption. At a time deemed appropriate by kernel developers, the old parameters can be easily removed without the need to rework any code. My testing: This works very well. I have not encountered any problems and I am happily allocating 128M/64G under each Thunderbolt port with nocrs. The success is dependent on my previous bug patch, which I decided to separate from this feature patch. The old functionality still works the same, and is overridden by the new commands if they are used. Remarks: The printk at the end of pci_setup() in drivers/pci/pci.c has a strange backquote ('`', under the tilde on the keyboard key) in the format. I am not sure if it is a typo, or deliberate, and whether it should change or if my KERN_WARNING logs need to use that symbol. Also, the scripts/checkpatch.pl berates me for using printk, but the alternative, pci_warn(), requires a pci_dev to work, and this message does not pertain to a particular device. Signed-off-by: Nicholas Johnson <nicholas.johnson-opensource@xxxxxxxxxxxxxx> --- .../admin-guide/kernel-parameters.txt | 21 ++++++++-- drivers/pci/pci.c | 39 +++++++++++++++++-- drivers/pci/setup-bus.c | 26 +++++++------ include/linux/pci.h | 3 +- 4 files changed, 69 insertions(+), 20 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index b799bcf67..284752ff7 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -3327,12 +3327,27 @@ the default. off: Turn ECRC off on: Turn ECRC on. - hpiosize=nn[KMG] The fixed amount of bus space which is + + hpiosize=nn[KMG] Depreciated. Overridden by hp_io_size + value if any of hp_io_size, hp_mmio_size + or hp_mmio_pref_size are used. This sets + hp_io_size to the given value if not + overridden. + + hpmemsize=nn[KMG] Depreciated. Overridden if any of + hp_io_size, hp_mmio_size or + hp_mmio_pref_size are used. This sets + hp_mmio_size and hp_mmio_pref_size to + the given value if not overridden. + hp_io_size=nn[KMG] The fixed amount of bus space which is reserved for hotplug bridge's IO window. Default size is 256 bytes. - hpmemsize=nn[KMG] The fixed amount of bus space which is - reserved for hotplug bridge's memory window. + hp_mmio_size=nn[KMG] The fixed amount of bus space which is + reserved for hotplug bridge's 32-bit mmio window. Default size is 2 megabytes. + hp_mmio_pref_size=nn[KMG] The fixed amount of bus space + which is reserved for hotplug bridge's 64-bit + mmio_pref window. Default size is 2 megabytes. hpbussize=nn The minimum amount of additional bus numbers reserved for buses below a hotplug bridge. Default is 1. diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index c25acace7..64978bf84 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -85,12 +85,15 @@ unsigned long pci_cardbus_io_size = DEFAULT_CARDBUS_IO_SIZE; unsigned long pci_cardbus_mem_size = DEFAULT_CARDBUS_MEM_SIZE; #define DEFAULT_HOTPLUG_IO_SIZE (256) -#define DEFAULT_HOTPLUG_MEM_SIZE (2*1024*1024) -/* pci=hpmemsize=nnM,hpiosize=nn can override this */ +#define DEFAULT_HOTPLUG_MMIO_SIZE (2*1024*1024) +#define DEFAULT_HOTPLUG_MMIO_PREF_SIZE (2*1024*1024) +/* Override with pci=hp_io_size=nn,hp_mmio_size=nnM,hp_mmio_pref_size=nnM */ unsigned long pci_hotplug_io_size = DEFAULT_HOTPLUG_IO_SIZE; -unsigned long pci_hotplug_mem_size = DEFAULT_HOTPLUG_MEM_SIZE; +unsigned long pci_hotplug_mmio_size = DEFAULT_HOTPLUG_MMIO_SIZE; +unsigned long pci_hotplug_mmio_pref_size = DEFAULT_HOTPLUG_MMIO_PREF_SIZE; #define DEFAULT_HOTPLUG_BUS_SIZE 1 +/* Override with pci=hpbussize=nn,assign-busses */ unsigned long pci_hotplug_bus_size = DEFAULT_HOTPLUG_BUS_SIZE; enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_DEFAULT; @@ -6144,6 +6147,15 @@ EXPORT_SYMBOL(pci_fixup_cardbus); static int __init pci_setup(char *str) { + /* + * Depreciated pci=hpmemsize=nnM but keep the functionality for now. + * If none of hp_io_size, hp_mmio_size or hp_mmio_pref_size are set + * then override hp_mmio_size and hp_mmio_pref_size with hpmemsize. + */ + unsigned long pci_hotplug_io_size_old = DEFAULT_HOTPLUG_IO_SIZE; + unsigned long pci_hotplug_mem_size = DEFAULT_HOTPLUG_MMIO_SIZE; + bool use_new_pci_hotplug_params = 0; + while (str) { char *k = strchr(str, ','); if (k) @@ -6176,9 +6188,23 @@ static int __init pci_setup(char *str) } else if (!strncmp(str, "ecrc=", 5)) { pcie_ecrc_get_policy(str + 5); } else if (!strncmp(str, "hpiosize=", 9)) { - pci_hotplug_io_size = memparse(str + 9, &str); + printk(KERN_WARNING "PCI: Depreciated option 'hpiosize', use 'hp_io_size' instead\n"); + pci_hotplug_io_size_old = + memparse(str + 9, &str); } else if (!strncmp(str, "hpmemsize=", 10)) { + printk(KERN_WARNING "PCI: Depreciated option 'hpmemsize', use 'hp_mmio_size' and 'hp_mmio_pref_size' instead\n"); pci_hotplug_mem_size = memparse(str + 10, &str); + } else if (!strncmp(str, "hp_io_size=", 11)) { + use_new_pci_hotplug_params = 1; + pci_hotplug_io_size = memparse(str + 11, &str); + } else if (!strncmp(str, "hp_mmio_size=", 13)) { + use_new_pci_hotplug_params = 1; + pci_hotplug_mmio_size = + memparse(str + 13, &str); + } else if (!strncmp(str, "hp_mmio_pref_size=", 18)) { + use_new_pci_hotplug_params = 1; + pci_hotplug_mmio_pref_size = + memparse(str + 18, &str); } else if (!strncmp(str, "hpbussize=", 10)) { pci_hotplug_bus_size = simple_strtoul(str + 10, &str, 0); @@ -6204,6 +6230,11 @@ static int __init pci_setup(char *str) } str = k; } + if (!use_new_pci_hotplug_params) { + pci_hotplug_io_size = pci_hotplug_io_size_old; + pci_hotplug_mmio_size = pci_hotplug_mem_size; + pci_hotplug_mmio_pref_size = pci_hotplug_mem_size; + } return 0; } early_param("pci", pci_setup); diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c index c7e0a5e2b..e4cdc6484 100644 --- a/drivers/pci/setup-bus.c +++ b/drivers/pci/setup-bus.c @@ -1234,7 +1234,8 @@ void __pci_bus_size_bridges(struct pci_bus *bus, struct list_head *realloc_head) { struct pci_dev *dev; unsigned long mask, prefmask, type2 = 0, type3 = 0; - resource_size_t additional_mem_size = 0, additional_io_size = 0; + resource_size_t additional_io_size = 0, additional_mmio_size = 0, + additional_mmio_pref_size = 0; struct resource *b_res; int ret; @@ -1267,8 +1268,9 @@ void __pci_bus_size_bridges(struct pci_bus *bus, struct list_head *realloc_head) case PCI_CLASS_BRIDGE_PCI: pci_bridge_check_ranges(bus); if (bus->self->is_hotplug_bridge) { - additional_io_size = pci_hotplug_io_size; - additional_mem_size = pci_hotplug_mem_size; + additional_io_size = pci_hotplug_io_size; + additional_mmio_size = pci_hotplug_mmio_size; + additional_mmio_pref_size = pci_hotplug_mmio_pref_size; } /* Fall through */ default: @@ -1287,9 +1289,8 @@ void __pci_bus_size_bridges(struct pci_bus *bus, struct list_head *realloc_head) prefmask |= IORESOURCE_MEM_64; ret = pbus_size_mem(bus, prefmask, prefmask, prefmask, prefmask, - realloc_head ? 0 : additional_mem_size, - additional_mem_size, realloc_head); - + realloc_head ? 0 : additional_mmio_pref_size, + additional_mmio_pref_size, realloc_head); /* * If successful, all non-prefetchable resources * and any 32-bit prefetchable resources will go in @@ -1310,9 +1311,9 @@ void __pci_bus_size_bridges(struct pci_bus *bus, struct list_head *realloc_head) if (!type2) { prefmask &= ~IORESOURCE_MEM_64; ret = pbus_size_mem(bus, prefmask, prefmask, - prefmask, prefmask, - realloc_head ? 0 : additional_mem_size, - additional_mem_size, realloc_head); + prefmask, prefmask, + realloc_head ? 0 : additional_mmio_pref_size, + additional_mmio_pref_size, realloc_head); /* * If successful, only non-prefetchable resources @@ -1321,7 +1322,8 @@ void __pci_bus_size_bridges(struct pci_bus *bus, struct list_head *realloc_head) if (ret == 0) mask = prefmask; else - additional_mem_size += additional_mem_size; + additional_mmio_size += + additional_mmio_pref_size; type2 = type3 = IORESOURCE_MEM; } @@ -1342,8 +1344,8 @@ void __pci_bus_size_bridges(struct pci_bus *bus, struct list_head *realloc_head) * window. */ pbus_size_mem(bus, mask, IORESOURCE_MEM, type2, type3, - realloc_head ? 0 : additional_mem_size, - additional_mem_size, realloc_head); + realloc_head ? 0 : additional_mmio_size, + additional_mmio_size, realloc_head); break; } } diff --git a/include/linux/pci.h b/include/linux/pci.h index 65f1d8c2f..b30d55697 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -1959,7 +1959,8 @@ extern u8 pci_dfl_cache_line_size; extern u8 pci_cache_line_size; extern unsigned long pci_hotplug_io_size; -extern unsigned long pci_hotplug_mem_size; +extern unsigned long pci_hotplug_mmio_size; +extern unsigned long pci_hotplug_mmio_pref_size; extern unsigned long pci_hotplug_bus_size; /* Architecture-specific versions may override these (weak) */ -- 2.19.1