On 03.04.2019 15:14, Bjorn Helgaas wrote: > [+cc Frederick] > > On Wed, Apr 03, 2019 at 07:53:40AM +0200, Heiner Kallweit wrote: >> On 02.04.2019 23:57, Bjorn Helgaas wrote: >>> On Tue, Apr 02, 2019 at 10:41:20PM +0200, Heiner Kallweit wrote: >>>> On 02.04.2019 22:16, Florian Fainelli wrote: >>>>> On 4/2/19 12:55 PM, Heiner Kallweit wrote: >>>>>> There are numerous reports about different problems caused by ASPM >>>>>> incompatibilities between certain network chip versions and board >>>>>> chipsets. On the other hand on (especially mobile) systems where ASPM >>>>>> works properly it can significantly contribute to power-saving and >>>>>> increased battery runtime. >>>>>> One problem so far to make ASPM configurable was to find an acceptable >>>>>> way of configuration (e.g. module parameters are discouraged for that >>>>>> purpose). >>>>>> >>>>>> As a new attempt let's switch off ASPM per default and make it >>>>>> configurable by a sysfs attribute. The attribute is documented in >>>>>> new file Documentation/networking/device_drivers/realtek/r8169.txt. >>> >>> Both module parameters and sysfs attributes are a poor user >>> experience. It's very difficult for users to figure out that >>> a tweak is needed. >>> >>>>> I am not sure this is where it should be solved, there is >>>>> definitively a device specific aspect to properly supporting the >>>>> enabling of ASPM L0s, L1s etc, but the actual sysfs knobs should >>>>> belong to the pci_device itself, since this is something that >>>>> likely other drivers would want to be able to expose. You would >>>>> probably want to work with the PCI maintainers to come up with a >>>>> standard solution that applies beyond just r8169 since presumably >>>>> there must be a gazillion of devices with the same issues. >>> >>> The Linux PCI core support for ASPM is poor. Without more details, >>> it's impossible to tell whether these issues are hardware or firmware >>> defects on the device itself, or something that Linux is doing wrong. >>> There are several known defects, especially related to L1 substates >>> and hotplug. >>> >> The vendor refuses to release datasheets or errata. Only certain >> combinations of board chipsets (and maybe BIOS versions) and network >> chip versions (from the ~ 50 supported by the driver) seem to be >> affected. One typical symptom is missed RX packets, maybe the RX FIFO >> isn't big enough to buffer all packets until PCIe has woken up. >> The Windows vendor driver uses a hack, they dynamically disable ASPM >> under load. > > I'm not super sympathetic to vendors like that or to OEMs that work > with them. If we can make the NIC work reliably by disabling ASPM, > that's step one. If we can figure out how to extend battery life by > enabling ASPM in some cases, great, but we have to be careful to do it > in a way that is supportable and doesn't generate lots of user > complaints that require debugging. > > That said, I think Frederick has already started working on a plan for > the PCI core to expose sysfs files to manage ASPM. This is similar to > the link_state files enabled by CONFIG_PCIEASPM_DEBUG, but it will be > always enabled and probably structured slightly differently. The idea > is that this would be generic and would not require any driver > support. > Thanks, Bjorn! Frederick, is there anything you could share already? Or any timeline? Based on Bjorns info what seems to be best to me: 1. Disable ASPM for r8169 on stable (back to 4.19). 2. Once the generic ASPM sysfs attributes are available, reenable ASPM for r8169 in net-next. > Bjorn > Heiner