On 6/3/2019 10:52 PM, Bjorn Helgaas wrote: > [+cc Rafael, just FYI] > > On Mon, Jun 03, 2019 at 01:30:51PM +0530, Abhishek Sahu wrote: >> On 6/1/2019 2:09 AM, Bjorn Helgaas wrote: >>> On Fri, May 31, 2019 at 10:31:09AM +0530, Abhishek Sahu wrote: >>>> NVIDIA Turing GPUs include hardware support for USB Type-C and >>>> VirtualLink. It helps in delivering the power, display, and data >>>> required to power VR headsets through a single USB Type-C connector. >>>> The Turing GPU is a multi-function PCI device has the following >>>> four functions: >>>> >>>> - VGA display controller (Function 0) >>>> - Audio controller (Function 1) >>>> - USB xHCI Host controller (Function 2) >>>> - USB Type-C USCI controller (Function 3) >>>> >>>> The function 0 is tightly coupled with other functions in the >>>> hardware. When function 0 goes in runtime suspended state, >>>> then it will do power gating for most of the hardware blocks. >>>> Some of these hardware blocks are used by other functions which >>>> leads to functional failure. So if any of these functions (1/2/3) >>>> are active, then function 0 should also be in active state. >>> >>>> 'commit 07f4f97d7b4b ("vga_switcheroo: Use device link for >>>> HDA controller")' creates the device link from function 1 to >>>> function 0. A similar kind of device link needs to be created >>>> between function 0 and functions 2 and 3 for NVIDIA Turing GPU. >>> >>> I can't point to language that addresses this, but this sounds like a >>> case of the GPU not conforming to the PCI spec. The general >>> assumption is that the OS should be able to discover everything it >>> needs to do power management directly from the architected PCI config >>> space. >> >> The GPU is following PCIe spec but following is the implementation >> from HW side > > Unless you can find spec language that talks about D-state > dependencies between functions, I claim this is not following the > PCIe spec. For example, PCIe r5.0, sec 1.4, says "the PCI/PCIe > hardware/software model includes architectural constructs necessary to > discover, configure, and use a Function, without needing Function- > specific knowledge." Sec 5.1 says "D states are associated with a > particular Function" and "PM provides ... a mechanism to identify > power management capabilities of a given Function [and] the ability to > transition a Function into a certain power management state." > Thanks Bjorn. Here in case of GPU's these functions are not completely independent so it is not following PCIe spec in that aspect. > If there *is* something about dependencies between functions in the > spec, we should improve the generic PCI core to pay attention to that, > and then we wouldn't need this quirk. > > If the spec doesn't provide a way to discover them, these dependencies > are exceptions from the spec, and we have to handle them as hardware > defects, using quirks like this. That's fine, but let's not pretend > that this is a conforming device and that adding quirks is the > expected process. Just call a spade a spade and say we're working > around a defect in this particular device. > Yes. I am agree with that we need to be very careful in adding quirks like this. I will communicate the same to HW team so they can explore other options to handle this in HW design side for future chips. > I think the best path forward would be to add this quirk for the > existing device, and then pursue a spec change to add something like > a new PCIe capability to describe the dependencies. Then we could > enhance the PCI core once and power management for future devices > would "Just Work" without having to add quirks. > Yes. It will be long term process. If other HW has similar requirement then it would be good to have this. Regards, Abhishek