On Wed, Feb 08, 2023 at 09:15:50AM +0000, Ping-Ke Shih wrote: > > -----Original Message----- > > From: Bjorn Helgaas <helgaas@xxxxxxxxxx> > > On Fri, Aug 19, 2022 at 02:48:10PM +0800, Ping-Ke Shih wrote: > > > From: Chin-Yen Lee <timlee@xxxxxxxxxxx> > > > > > > 8852CE controls CLKREQ, ASPM L1, L1ss via wifi registers > > > instead, so change them accordingly. > > ... > > We get here via this path: > > > > rtw89_pci_probe > > rtw89_pci_l1ss_cfg > > pci_read_config_dword(pdev, l1ss_cap_ptr + PCI_L1SS_CTL1, &l1ss_ctrl); > > if (l1ss_ctrl & PCI_L1SS_CTL1_L1SS_MASK) > > rtw89_pci_l1ss_set(rtwdev, true); > > > > This looks like it might be a problem because L1SS configuration > > is owned by the PCI core, not by the device driver. The PCI core > > provides sysfs user interfaces that can enable and disable L1SS at > > run-time without notification to the driver (see [1]). > > > > The user may enable or disable L1SS using those sysfs interfaces, > > and this code in the rtw89 driver will not be called. > > The chunk of code is to configure L1SS of chip specific setting > along with standard PCI capability, and normally the setting and > capability are consistent. An exception is that PCI capability is > enabled but chip specific setting is disabled, when we want to use > module parameter to disable chip specific setting experimentally to > resolve interoperability problem on some platforms. This is a significant usability problem. An interoperability problem means the device doesn't work correctly for some users, and there's no obvious reason *why* it doesn't work, so they don't know how to fix it. Module parameters are not a solution because users don't know when they are needed or how to use them. This leads to situations like [1,2,3], where users waste a lot of time flailing around to get the device to work, and the eventual "solution" is to replace it with something else: After replacing the Realtek card with Intel AX200 I do not have the described problem anymore. > We don't suggest the use case that L1SS of PCI capability is > disabled but chip specific setting is enabled, because hardware > could get abnormal occasionally. Also, it could also get unexpected > behavior suddenly if we change L1SS dynamically. > > Summary: > > PCI capability chip specific setting comment > -------------- --------------------- ------- > enabled enabled ok, currently support > disabled disabled ok, currently support > enabled disabled experimental case via module parameter > disabled enabled don't suggest I think the fact that you need chip-specific code here is a hardware defect in the rtw89 device. The whole point of L1SS being in the PCIe spec is so generic software can configure it without having to know chip-specific details. > With above reasons, if users meet problem or unexpected result after > changing L1SS, we may tell them this hardware can't dynamically > configure L1SS via sysfs interfaces. How can we make this better, so the device works and users never have to specify those module parameters? Would it help if we had a way to make a quirk that meant "never enable L1SS for this device"? Obviously that's not ideal because we want the power savings of L1SS, but the power saving is only worthwhile if the device always *works*. Or maybe we could have a quirk that means "the PCI core will never change the L1SS configuration for this device"? Would that help? Bjorn [1] https://github.com/lwfinger/rtw89/issues/41 [2] https://bbs.archlinux.org/viewtopic.php?id=273515 [3] https://bugs.launchpad.net/ubuntu/+source/linux-firmware/+bug/1971656