On 02.10.2023 17:14, Bjorn Helgaas wrote: > [+cc Sathy, Lukas] > > On Sat, Aug 26, 2023 at 01:10:35PM +0200, Heiner Kallweit wrote: >> After the referenced commit we may see L1 sub-states being active >> unexpectedly. Following scenario as an example: >> r8169 disables L1 because of known hardware issues on a number of >> systems. Implicitly L1.1 and L1.2 are disabled too. >> On my system L1 and L1.1 work fine, but L1.2 causes missed >> rx packets. Therefore I write 1 to aspm_l1_1. >> This removes ASPM_STATE_L1 from the disabled modes and therefore >> unexpectedly enables also L1.2. So return to the old behavior. >> >> A comment in the commit message of the referenced change correctly points >> out that this behavior is inconsistent with aspm_attr_store_common(). >> So change aspm_attr_store_common() accordingly. > > I think we should split this into a pure revert of fb097dcd5a28 with > the description of the unintended consequence, followed by another > patch to change aspm_attr_store_common(). > OK > I guess the existing aspm_attr_store_common() behavior allows a > similar unexpected behavior? For example, in this sequence: > > - Write 0 to "l1_aspm" to disable L1 > - Write 1 to "l1_1_aspm" to enable L1.1 > > does L1.2 get implicitly enabled as well even though that's clearly > not what the user intended? > Right, it's the same here. Therefore the second change in the patch. > There's also the separate question of how the sysfs file and the > pci_disable_link_state() API should interact. Drivers use that API > when they know about a defect in their device, but the user can > override that via syfs. > > In [1], we have a similar situation with D3cold support, where we're > thinking that we should not allow a user to use sysfs to override that > driver knowledge. > In my r8169 use case ASPM works fine on one system, but causes issues on another one with same NIC chip version. So it may be a wild mix of NIC hw erratum, BIOS bug, mainboard chipset incompatibilities, etc. Therefore I disable L1 per default, and allow users to re-enable L1/L1.1/L1.2 if their system isn't affected. > Bjorn > > [1] https://lore.kernel.org/r/b8a7f4af2b73f6b506ad8ddee59d747cbf834606.1695025365.git.lukas@xxxxxxxxx > >> Fixes: fb097dcd5a28 ("PCI/ASPM: Disable only ASPM_STATE_L1 when driver disables L1") >> Signed-off-by: Heiner Kallweit <hkallweit1@xxxxxxxxx> >> --- >> drivers/pci/pcie/aspm.c | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c >> index 3dafba0b5..6d3788257 100644 >> --- a/drivers/pci/pcie/aspm.c >> +++ b/drivers/pci/pcie/aspm.c >> @@ -1063,7 +1063,7 @@ static int __pci_disable_link_state(struct pci_dev *pdev, int state, bool sem) >> if (state & PCIE_LINK_STATE_L0S) >> link->aspm_disable |= ASPM_STATE_L0S; >> if (state & PCIE_LINK_STATE_L1) >> - link->aspm_disable |= ASPM_STATE_L1; >> + link->aspm_disable |= ASPM_STATE_L1 | ASPM_STATE_L1SS; >> if (state & PCIE_LINK_STATE_L1_1) >> link->aspm_disable |= ASPM_STATE_L1_1; >> if (state & PCIE_LINK_STATE_L1_2) >> @@ -1251,6 +1251,8 @@ static ssize_t aspm_attr_store_common(struct device *dev, >> link->aspm_disable &= ~ASPM_STATE_L1; >> } else { >> link->aspm_disable |= state; >> + if (state & ASPM_STATE_L1) >> + link->aspm_disable |= ASPM_STATE_L1SS; >> } >> >> pcie_config_aspm_link(link, policy_to_aspm_state(link)); >> -- >> 2.42.0 >>