On Tue, Oct 26, 2021 at 1:31 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote: > > On Mon, Oct 25, 2021 at 06:33:50PM +0800, Kai-Heng Feng wrote: > > On Thu, Oct 21, 2021 at 11:09 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote: > > > On Thu, Oct 21, 2021 at 11:51:59AM +0800, Kai-Heng Feng wrote: > > > > Sometimes BIOS may not be able to program ASPM and LTR settings, for > > > > instance, the NVMe devices behind Intel VMD bridges. For this case, both > > > > ASPM and LTR have to be enabled to have significant power saving. > > > > > > > > Since we still want POLICY_DEFAULT honor the default BIOS ASPM settings, > > > > introduce LTR sysfs knobs so users can set max snoop and max nosnoop > > > > latency manually or via udev rules. > > > > > > How is the user supposed to figure out what "max snoop" and "max > > > nosnoop" values to program? > > > > Actually, the only way I know is to get the value from other OS. > > I don't see how this can be a workable solution. IIUC this is > specifically related to ASPM L1.2. L1.2 depends on LTR (the max > snoop/nosnoop values) and the TPOWER_ON and Common_Mode_Restore_Time > values. PCIe r5.0, sec 5.5.4, says: > > Prior to setting either or both of the enable bits for L1.2, the > values for TPOWER_ON, Common_Mode_Restore_Time, and, if the ASPM > L1.2 Enable bit is to be Set, the LTR_L1.2_THRESHOLD (both Value and > Scale fields) must be programmed. > > The TPOWER_ON and Common_Mode_Restore_Time fields must be programmed > to the appropriate values based on the components and AC coupling > capacitors used in the connection linking the two components. The > determination of these values is design implementation specific. > > I do not think collecting values from some other OS is a reasonable > way to set these. The values would apparently depend on the > electrical design of the particular machine. What if we fallback to the original approach and use the VMD driver to enable ASPM and LTR values? At least I think Intel should be able to provide correct values for their SoC. > > > > If we add this, I'm afraid we'll have people programming random things > > > that seem to work but are not actually reliable. > > > > IMO users need to take full responsibility for own doings. > > Also, it's already doable by using setpci... > > I don't think it currently does, but setpci should taint the kernel. > > If users want to write setpci scripts to fiddle with stuff, that's > great, but that moves it outside the supportable realm. If we provide > a sysfs interface to do this, then it becomes more of *our* problem to > make it work correctly, and I don't think that's practical in this > case. OK. > > > If we don't want to add LTR sysfs, what other options do we have to > > enable VMD ASPM and LTR by default since BIOS doesn't do it for us? > > 1) Enable it in the PCI or VMD driver, however this approach violates > > POLICY_DEFAULT. > > 2) Use `setpci` directly in udev rules to enable VMD ASPM and LTR. > > > > I think 2) is bad, and since 1) isn't so good either, the approach in > > this patch may be the best compromise. > > I do not know how to safely enable L1.2. It's likely that I'm just > missing something, but I don't see enough information in PCI config > space and the PCI Firmware interface (_DSM for Latency Tolerance > Reporting) to enable L1.2. It's possible that a new firmware > interface is required. I was told by Intel that they didn't use _DSM to get LTR values, at least not the VMD case. > > I don't think it's wise to enable L1.2 unless we have good confidence > that we know how to do it correctly. It's hard enough to debug ASPM > issues as it is. So what other options do we have if we want to enable VMD ASPM while keeping CONFIG_PCIEASPM_DEFAULT=y? Right now we enabled the VMD ASPM/LTR bits in downstream kernel, but other distro users may want to have upstream support for this. Kai-Heng > > Bjorn