On Fri, Jul 15, 2022 at 9:40 AM Manyi Li <limanyi@xxxxxxxxxxxxx> wrote: > > > > On 2022/7/14 11:20, Kai-Heng Feng wrote: > > [+Cc Matthew] > > > > On Thu, Jul 14, 2022 at 2:28 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote: > >> > >> [+cc Kai-Heng, Vidya, who also have ASPM patches in flight] > >> > >> On Wed, Jul 13, 2022 at 07:26:12PM +0800, Manyi Li wrote: > >>> Startup log of ASUSTeK X456UJ Notebook show: > >>> [ 0.130563] ACPI FADT declares the system doesn't support PCIe ASPM, so disable it > >>> [ 48.092472] pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID) > >>> [ 48.092479] pcieport 0000:00:1c.5: device [8086:9d15] error status/mask=00000001/00002000 > >>> [ 48.092481] pcieport 0000:00:1c.5: [ 0] RxErr > >>> [ 48.092490] pcieport 0000:00:1c.5: AER: Corrected error received: 0000:00:1c.5 > >>> [ 48.092504] pcieport 0000:00:1c.5: AER: can't find device of ID00e5 > >>> [ 48.092506] pcieport 0000:00:1c.5: AER: Corrected error received: 0000:00:1c.5 > >> > >> Can you elaborate on the connection between the FADT ASPM bit and the > >> AER logs above? > > Sorry,I don't know about that. > > >> > >> What problem are we solving here? A single corrected error being > >> logged? An infinite stream of errors? A device that doesn't work at > >> all? > > > > Agree, what's the real symptom of the issue? > > Please see the details of this issus: > https://bugzilla.kernel.org/show_bug.cgi?id=216245 > > > > >> > >> We don't need the dmesg timestamps unless they contribute to > >> understanding the problem. I don't think they do in this case. > > > > According to commit 387d37577fdd ("PCI: Don't clear ASPM bits when the > > FADT declares it's unsupported"), the bit means "just use the ASPM > > bits handed over by BIOS". > > > > However, I do wonder why both drivers/pci/pci-acpi.c and > > drivers/acpi/pci_root.c are doing the ACPI_FADT_NO_ASPM check, Because pci_root.c doesn't read aspm_disabled. > > maybe one of them should be removed? Arguably, pci_root.c could look at aspm_disabled instead of looking at the FADT flag directly. > I think duplicate work has been done, but comment > in drivers/acpi/pci_root.c is > * We want to disable ASPM here, but aspm_disabled > * needs to remain in its state from boot so that we > * properly handle PCIe 1.1 devices. So we set this > * flag here, to defer the action until after the ACPI > * root scan. > > I don't understand this logic. This is about the case after failing acpi_pci_osc_control_set() and generally we need to defer setting aspm_disabled because of pcie_aspm_sanity_check(). > > > >> > >>> Signed-off-by: Manyi Li <limanyi@xxxxxxxxxxxxx> > >>> --- > >>> drivers/pci/pcie/aspm.c | 1 + > >>> 1 file changed, 1 insertion(+) > >>> > >>> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > >>> index a96b7424c9bc..b173d3c75ae7 100644 > >>> --- a/drivers/pci/pcie/aspm.c > >>> +++ b/drivers/pci/pcie/aspm.c > >>> @@ -1359,6 +1359,7 @@ void pcie_no_aspm(void) > >>> if (!aspm_force) { > >>> aspm_policy = POLICY_DEFAULT; > >>> aspm_disabled = 1; > >>> + aspm_support_enabled = false; > >> > >> This makes pcie_no_aspm() work the same as booting with > >> "pcie_aspm=off". That might be reasonable. > >> > >> I do wonder why we need both "aspm_disabled" and > >> "aspm_support_enabled". And I wonder why we need to set "aspm_policy" > >> when we're disabling ASPM. But those aren't really connected to your > >> change here. > > > > From what I can understand "aspm_disabled" means "don't touch ASPM > > left by BIOS", and "aspm_support_enabled" means "whether ASPM is > > disabled via command line". > > There seems to be some overlaps though. > > According to commit 8b8bae901ce23 ("PCI/ACPI: Report ASPM support to > BIOS if not disabled from command line"), "aspm_support_enabled" means > whether or not report ASPM support to the BIOS through _OSC. Right.