On Thu, Oct 31, 2019 at 06:47:10PM +0800, Dilip Kota wrote:
> On 10/31/2019 6:14 AM, Bjorn Helgaas wrote:
> > On Tue, Oct 29, 2019 at 05:31:18PM +0800, Dilip Kota wrote:
> > > On 10/22/2019 8:59 PM, Bjorn Helgaas wrote:
> > > > [+cc Rafael, linux-pm, beginning of discussion at
> > > > https://lore.kernel.org/r/d8574605f8e70f41ce1e88ccfb56b63c8f85e4df.1571638827.git.eswara.kota@xxxxxxxxxxxxxxx]
> > > >
> > > > On Tue, Oct 22, 2019 at 05:27:38PM +0800, Dilip Kota wrote:
> > > > > On 10/22/2019 1:18 AM, Bjorn Helgaas wrote:
> > > > > > On Mon, Oct 21, 2019 at 02:38:50PM +0100, Andrew Murray wrote:
> > > > > > > On Mon, Oct 21, 2019 at 02:39:20PM +0800, Dilip Kota wrote:
> > > > > > > > PCIe RC driver on Intel Gateway SoCs have a requirement
> > > > > > > > of changing link width and speed on the fly.
> > > > > > Please add more details about why this is needed. Since
> > > > > > you're adding sysfs files, it sounds like it's not
> > > > > > actually the *driver* that needs this; it's something in
> > > > > > userspace?
> > > > > We have use cases to change the link speed and width on the fly.
> > > > > One is EMI check and other is power saving. Some battery backed
> > > > > applications have to switch PCIe link from higher GEN to GEN1 and
> > > > > width to x1. During the cases like external power supply got
> > > > > disconnected or broken. Once external power supply is connected then
> > > > > switch PCIe link to higher GEN and width.
> > > > That sounds plausible, but of course nothing there is specific to the
> > > > Intel Gateway, so we should implement this generically so it would
> > > > work on all hardware.
> > > Agree.
> > > > I'm not sure what the interface should look like -- should it be a
> > > > low-level interface as you propose where userspace would have to
> > > > identify each link of interest, or is there some system-wide
> > > > power/performance knob that could tune all links? Cc'd Rafael and
> > > > linux-pm in case they have ideas.
> > > To my knowledge sysfs is the appropriate way to go.
> > > If there are any other best possible knobs, will be helpful.
> > I agree sysfs is the right place for it; my question was whether we
> > should have files like:
> >
> >   /sys/.../0000:00:1f.3/pcie_speed
> >   /sys/.../0000:00:1f.3/pcie_width
> >
> > as I think this patch would add (BTW, please include sample paths like
> > the above in the commit log), or whether there should be a more global
> > thing that would affect all the links in the system.
> Sure, i will add them.
> >
> > I think the low-level files like you propose would be better because
> > one might want to tune link performance differently for different
> > types of devices and workloads.
> >
> > We also have to decide if these files should be associated with the
> > device at the upstream or downstream end of the link. For ASPM, the
> > current proposal [1] has the files at the downstream end on the theory
> > that the GPU, NIC, NVMe device, etc is the user-recognizable one.
> > Also, neither ASPM nor link speed/width make any sense unless there
> > *is* a device at the downstream end, so putting them there
> > automatically makes them visible only when they're useful.
>
> This patch places the speed and width in the host controller directory.
>   /sys/.../xxx.pcie/pcie_speed
>   /sys/.../xxx.pcie/pcie_width
>
> I agree with you partially, because i am having couple of points
> making me to keep speed and width change entries in controller
> directory:
>
> -- For changing the speed/width with device node, software ends up
>    traversing to the controller from the device and do the
>    operations.
> -- Change speed and width are performed at controller level,

The controller is effectively a Root Complex, which may contain several
Root Ports. I have the impression that the Synopsys controller only
supports a single Root Port, but that's just a detail of the Synopsys
implementation.
I think it should be possible to configure the width/speed of each Root
Port individually.

> -- Keeping speed and width in controller gives a perspective (to the
>    user) of changing them only once irrespective of no. of devices.

What if there's a switch? If we change the width/speed of the link
between the Root Port and the Switch Upstream Port, that doesn't do
anything about the links from the Switch Downstream Ports.

> -- For speed and link change in Synopsys PCIe controller, specific
>    registers need to be configured. This prevents or complicates
>    adding the speed and width change functionality in pci-sysfs or
>    pci framework.

Don't the Link Control and related registers in PCIe spec give us
enough control to manage the link width/speed of *all* links, including
those from Root Ports and Switch Downstream Ports? If the Synopsys
controller requires controller-specific registers, that sounds to me
like it doesn't quite conform to the spec. Maybe that means we would
need some sort of quirk or controller callback?

Bjorn
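
To make the question about the standard registers concrete, here is a
minimal sketch (not part of the patch under discussion) of the fields in
the PCI Express Capability that report and request link speed. It is
pure bit manipulation on register values and touches no hardware. The
function names are hypothetical; the offsets and masks mirror the
PCI_EXP_LNKCTL, PCI_EXP_LNKSTA and PCI_EXP_LNKCTL2 definitions in the
Linux pci_regs.h header, and the speed table assumes the usual encoding
up to 16 GT/s.

```python
# Offsets relative to the start of the PCI Express Capability structure.
PCI_EXP_LNKCTL = 0x10        # Link Control
PCI_EXP_LNKSTA = 0x12        # Link Status
PCI_EXP_LNKCTL2 = 0x30       # Link Control 2

PCI_EXP_LNKCTL_RL = 0x0020       # Retrain Link (bit 5)
PCI_EXP_LNKSTA_CLS = 0x000F      # Current Link Speed (bits 3:0)
PCI_EXP_LNKSTA_NLW = 0x03F0      # Negotiated Link Width (bits 9:4)
PCI_EXP_LNKCTL2_TLS = 0x000F     # Target Link Speed (bits 3:0)

# Assumed encoding of the speed fields (codes 1..4 only).
SPEED_GTS = {1: "2.5 GT/s", 2: "5.0 GT/s", 3: "8.0 GT/s", 4: "16.0 GT/s"}


def decode_link_status(lnksta):
    """Return (speed string, lane count) from a Link Status value."""
    speed_code = lnksta & PCI_EXP_LNKSTA_CLS
    width = (lnksta & PCI_EXP_LNKSTA_NLW) >> 4
    return SPEED_GTS.get(speed_code, "unknown"), width


def with_target_speed(lnkctl2, speed_code):
    """Return a Link Control 2 value with Target Link Speed replaced."""
    if speed_code not in SPEED_GTS:
        raise ValueError("unsupported speed code: %r" % (speed_code,))
    return (lnkctl2 & ~PCI_EXP_LNKCTL2_TLS & 0xFFFF) | speed_code


def with_retrain(lnkctl):
    """Return a Link Control value with the Retrain Link bit set."""
    return lnkctl | PCI_EXP_LNKCTL_RL


if __name__ == "__main__":
    # Example values only, not read from a real device.
    print(decode_link_status(0x0043))
    print(hex(with_target_speed(0x0143, 1)))
    print(hex(with_retrain(0x0040)))
```

In this sketch a speed change would be requested by writing the new
Link Control 2 value on the downstream port (Root Port or Switch
Downstream Port) and then setting Retrain Link in Link Control. It only
reads the negotiated width; whether the standard registers also give
enough control over link width is the open question above, and the
sketch does not answer it.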