On 7/28/2021 9:31 PM, Bjorn Helgaas wrote:
External email: Use caution opening links or attachments
[+cc Rafael, driver-specific vs generic PM at very end]
On Wed, Jul 28, 2021 at 06:48:50AM +0530, Om Prakash Singh wrote:
On 7/27/2021 8:48 PM, Bjorn Helgaas wrote:
On Tue, Jul 27, 2021 at 07:51:37AM +0530, Om Prakash Singh wrote:
Hi Bjorn,
I think it makes sense to have the scope of keeping default ASPM
policy disable and API pci_enable_link_state() to selectively enable
by EP Driver.
sysfs interface for ASPM also does not allow enabling ASPM for a
device if the default policy (policy_to_aspm_state()) does not allow
it.
The ASPM policy implementation may require changes. I think the
current setup where a policy is compiled into the kernel via Kconfig
options is seriously flawed.
We need a fail-safe kernel parameter, i.e., "pcie_aspm=off", for cases
where devices don't work at all with ASPM. We need quirks to work
around devices known to be broken, e.g., those that advertise ASPM
support that doesn't actually work, or those that advertise incorrect
exit latencies. I think most other configuration should be done via
sysfs.
Consider a situation, for a platform one wants to utilize ASPM
capability of an onboard PCIe device because it is well evaluated,
at the same time they want to keep ASPM disabled for other PCIe
devices that can be connected on open PCIe slot to avoid possible
performance issue.
I see ASPM is broken on many devices, though the device shows ASPM
capabilities but has performance issues when it is enabled.
I'll wait to see your proposal and use case before commenting on this.
few suggestions:
1. We can have a device-tree property to disable ASPM capabilities of a Root
port. Corresponding to my example, the device-tree can use be use to enable
ASPM capabilities of Root ports that have known/good/onboard PCIe device,
and keep ASPM disabled for opens slots
ASPM can only be enabled for links between two devices. For empty
slots, there's an upstream device (Root Port or Switch Downstream
Port) but no downstream device, so ASPM won't be enabled anyway.
2. sysfs should allow overriding default ASPM policy.
With below change system can boot with default policy ASPM disabled
(performance). But sysfs will still allow to enable ASPM capability of
selective device
This sounds like a good idea. If we could get rid of the Kconfig ASPM
policy selection, we'd likely make PCIEASPM_POWER_SUPERSAVE the
default. Then userspace could use sysfs to disable ASPM states
selective as necessary for performance.
want to clarify this point, my idea was to have to have all ASPM mode
disable by default (PERFORMANCE). And through sysfs user should be
enable selected ASPM mode for each device. With current upstream code
this is not possible. We can introduce one more policy USER_SELECTABLE
if it make more sense.
Keeping ASPM disabled as default policy is important as any rough device
connected can impact on overall system performance.
Adding Heiner Kallweit to share his opinion as well
Even before removing the Kconfig policy selection (or if we can't do
that), I think it make sense to allow sysfs to override it.
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index a08e7d6..268c2c5 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -1278,7 +1278,7 @@
link->aspm_disable |= state;
}
- pcie_config_aspm_link(link, policy_to_aspm_state(link));
+ pcie_config_aspm_link(link, state);
mutex_unlock(&aspm_lock);
up_read(&pci_bus_sem);
3. Add API pci_enable_link_state()
This will give control to driver to change ASPM at runtime depending on
user selected performance mode. For example Android has different
performance modes for Wifi chip, for sleep and active state. We are using
similar API to overcome some performance issue related to wifi on our
internal platform.
I'd prefer to keep performance modes out of the drivers as much as
possible because that's not really a scalable approach. I'm not a
power management expert, but I suspect the most maintainable approach
is to do this in the generic PM core, with selectable overrides via
sysfs as necessary. I cc'd Rafael in case he wants to chime in.
Maybe there's a place for drivers to tweak ASPM at runtime, but I'm
not convinced yet. I think the intent of ASPM is that devices
advertise exit latencies that correspond with their internal
buffering, so the L0s and L1 states can be used with no significant
performance impact. But without more specifics, we can talk about
this all day without getting anywhere.
Bjorn