[+cc Fred] On Thu, Aug 09, 2018 at 09:25:20AM +0200, Patrick Talbert wrote: > Hello, > > I think there may be a logic issue with ASPM but I am not so familiar > with this area so maybe I misunderstand. > > acpi_pci_init() checks ACPI FADT method for the ACPI_FADT_NO_ASPM flag > and will disable ASPM support if it is set. It calls pcie_no_aspm() to > do this. pcie_no_aspm() sets the global aspm_disabled = 1 to indicate > this state: > > 365 static int __init acpi_pci_init(void) > 366 { > 367 int ret; > 368 > 369 if (acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_MSI) { > 370 pr_info("ACPI FADT declares the system doesn't > support MSI, so disable it\n"); > 371 pci_no_msi(); > 372 } > 373 > 374 if (acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_ASPM) { > 375 pr_info("ACPI FADT declares the system doesn't > support PCIe ASPM, so disable it\n"); > 376 pcie_no_aspm(); > 377 } > 378 > 379 ret = register_acpi_bus_type(&acpi_pci_bus); > 380 if (ret) > 381 return 0; > 382 > 383 pci_set_platform_pm(&acpi_pci_platform_pm); > 384 acpi_pci_slot_init(); > 385 acpiphp_init(); > 386 > 387 return 0; > 388 } > 389 arch_initcall(acpi_pci_init); > > 973 void pcie_no_aspm(void) > 974 { > 975 /* > 976 * Disabling ASPM is intended to prevent the kernel from modifying > 977 * existing hardware state, not to clear existing state. > To that end: > 978 * (a) set policy to POLICY_DEFAULT in order to avoid changing state > 979 * (b) prevent userspace from changing policy > 980 */ > 981 if (!aspm_force) { > 982 aspm_policy = POLICY_DEFAULT; > 983 aspm_disabled = 1; > 984 } > 985 } > > But then later on, during individual initialization of PCI devices, > pci_scan_slot() will call pcie_aspm_init_link_state(). > pcie_aspm_init_link_state() does its work unless global > aspm_support_enabled is false. > > But, shouldn't pcie_aspm_init_link_state() also check aspm_disabled ? > Something like: > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > index 8b56dad..5491dbf 100644 > --- a/drivers/pci/pcie/aspm.c > +++ b/drivers/pci/pcie/aspm.c > @@ -574,7 +574,7 @@ void pcie_aspm_init_link_state(struct pci_dev *pdev) > struct pcie_link_state *link; > int blacklist = !!pcie_aspm_sanity_check(pdev); > > - if (!aspm_support_enabled) > + if (!aspm_support_enabled || aspm_disabled) > return; > > if (pdev->link_state) > > > In "387d375 PCI: Don't clear ASPM bits when the FADT declares it's > unsupported" the message says: > > Communications with a hardware vendor confirm that the expected behaviour > on systems that set the FADT ASPM disable bit but which still grant full > PCIe control is for the OS to leave any BIOS configuration intact and > refuse to touch the ASPM bits. This mimics the behaviour of Windows. > > > I apologize, but I do not immediately understand the distinction > between the meaning of aspm_support_enabled and aspm_disabled so I am > likely missing some key point. Don't apologize, this code confuses everybody who looks at it, which is a clear signal that it needs some love :) > But anyway, this query comes about because a host which reports the > ACPI_FADT_NO_ASPM bit set hangs on boot unless pcie_aspm=off is set. That's a terrible situation and very difficult to debug. How did you figure out that this was related to ASPM? Can you tell anything about where the hang is? Is there any chance you could collect a complete dmesg log and output of "sudo lspci -vvv" from that system (obviously you'd have to boot with "pcie_aspm=off" or your patch, so we could attach it to a bugzilla.kernel.org report that could be mentioned in the patch? I assume this system works with Windows. I agree that based on the 387d37577fdd changelog, we probably shouldn't be doing anything in pcie_aspm_init_link_state(). Bjorn