On 2023-03-01 12:10, Abhishek Sahu wrote:
> So D3cold is not supported on this system.
> Most of the desktop systems doesn’t support D3cold.
> In that case, as Alex mentioned that after that patch the root port can also
> go into D3hot state.
>
> Another difference is that earlier we were changing the device power state by
> directly writing into PCI PM_CTRL registers. Now, we are using kernel generic
> runtime PM function to perform the same.
>
> We need to print the root port runtime status and power_state as Alex mentioned.

Understood. Thanks for explaining!

> Apart from that, can we try following things to get more information,
>
> Before binding the Device to vfio-pci driver, disable the runtime power
> management of the root port
>
> # echo on > /sys/bus/pci/devices/<root_port B:D:F>/power/control
>
> After this, bind the device to vfio-pci driver and check the runtime status and power_state
> for both device and root port. The root port runtime_status should be active and power_state
> should be D0.
>
> With the runtime PM disabled for the root port, check if this issue happens.
> It will give clue if the root port going into D3hot status is causing the issue or
> the use of runtime PM to put device into D3hot is causing this.

I prevented vfio-pci from loading automatically on boot, and booted with
nomodeset. I set the power control to manual (echo on), and then loaded
vfio-pci. At that point, the root port was at D0 (never entered D3hot)
but the card was at D3hot. There were no errors in dmesg when I started
the VM.
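For anyone who wants to repeat the check, each snapshot below is just
three sysfs reads per device. A rough sketch of a helper that collects
them in one go follows; pm_snapshot and SYSFS_ROOT are names made up for
illustration, not an existing tool, and the output format is my own
choice rather than anything the kernel prints.

```shell
#!/bin/sh
# Sketch: print power_state, runtime_status and control for each PCI device.
# SYSFS_ROOT and pm_snapshot are illustrative names (assumptions).
SYSFS_ROOT="${SYSFS_ROOT:-/sys/bus/pci/devices}"

pm_snapshot() {
    for dev in "$@"; do
        for attr in power_state power/runtime_status power/control; do
            path="$SYSFS_ROOT/$dev/$attr"
            if [ -r "$path" ]; then
                printf '%s %s %s\n' "$dev" "$attr" "$(cat "$path")"
            else
                # Attribute is not readable on this system
                printf '%s %s %s\n' "$dev" "$attr" "n/a"
            fi
        done
    done
}

# Run only when device addresses are passed on the command line.
if [ "$#" -gt 0 ]; then
    pm_snapshot "$@"
fi
```

Saved as a script, it could be run with the root port and the card as
arguments, e.g. 0000:03:02.0 0000:06:00.0, once before and once after
each step.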
On boot:

==> 6_2_before_vm_before_vfio_before_manual <==
# cat /sys/bus/pci/devices/0000:03:02.0/power_state
D0
# cat /sys/bus/pci/devices/0000:03:02.0/power/runtime_status
active
# cat /sys/bus/pci/devices/0000:03:02.0/power/control
auto
# cat /sys/bus/pci/devices/0000:06:00.0/power_state
unknown
# cat /sys/bus/pci/devices/0000:06:00.0/power/runtime_status
active
# cat /sys/bus/pci/devices/0000:06:00.0/power/control
on

After echo on > [..]/power/control

==> 6_2_before_vm_before_vfio_after_manual <==
# cat /sys/bus/pci/devices/0000:03:02.0/power_state
D0
# cat /sys/bus/pci/devices/0000:03:02.0/power/runtime_status
active
# cat /sys/bus/pci/devices/0000:03:02.0/power/control
on
# cat /sys/bus/pci/devices/0000:06:00.0/power_state
unknown
# cat /sys/bus/pci/devices/0000:06:00.0/power/runtime_status
active
# cat /sys/bus/pci/devices/0000:06:00.0/power/control
on

After loading vfio-pci:

==> 6_2_before_vm_after_vfio_after_manual <==
# cat /sys/bus/pci/devices/0000:03:02.0/power_state
D0
# cat /sys/bus/pci/devices/0000:03:02.0/power/runtime_status
active
# cat /sys/bus/pci/devices/0000:03:02.0/power/control
on
# cat /sys/bus/pci/devices/0000:06:00.0/power_state
D3hot
# cat /sys/bus/pci/devices/0000:06:00.0/power/runtime_status
suspended
# cat /sys/bus/pci/devices/0000:06:00.0/power/control
auto

And finally, while the VM was running:

==> 6_2_running_vm_after_vfio_after_manual <==
# cat /sys/bus/pci/devices/0000:03:02.0/power_state
D0
# cat /sys/bus/pci/devices/0000:03:02.0/power/runtime_status
active
# cat /sys/bus/pci/devices/0000:03:02.0/power/control
on
# cat /sys/bus/pci/devices/0000:06:00.0/power_state
D0
# cat /sys/bus/pci/devices/0000:06:00.0/power/runtime_status
active
# cat /sys/bus/pci/devices/0000:06:00.0/power/control
auto

> This “Completion-Wait loop timed out with vfio” prints is coming
> from the IOMMU driver. Can you please check once by adding ‘pci=realloc’
> in your separate installation and see if we the memory are enabled after
> D3hot cycles.
> If memory is getting disabled only after D3hot cycles with
> ‘pci=realloc’, then we need to find out at which stage it is happening
> (when the device is going into D3hot or when root port is going into D3hot).
>
> For this we can disable the runtime PM of both device and root port before
> binding the device to vfio-pci driver. Then enable runtime PM of device first
> and wait for it to go into suspended state. Then check lspci output.
> Then enable the same for root port and check lspci output.

Assuming I understood this correctly, I added pci=realloc, and the
memory was disabled while in D3hot, but enabled as expected on D0.
While running the below commands, power control was set to the default
"auto" after a fresh reboot.

Before running the VM:

# lspci -vvnn -s 0000:06:00.0
06:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Bonaire XT [Radeon HD 7790/8770 / R7 360 / R9 260/360 OEM] [1002:665c] (prog-if 00 [VGA controller])
	Subsystem: ASUSTeK Computer Inc. Radeon HD 7790 DirectCU II OC [1043:0452]
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 255
	Region 0: Memory at 1030000000 (64-bit, prefetchable) [disabled] [size=256M]
	Region 2: Memory at 1040000000 (64-bit, prefetchable) [disabled] [size=8M]
	Region 4: I/O ports at d000 [disabled] [size=256]
	Region 5: Memory at d0100000 (32-bit, non-prefetchable) [disabled] [size=256K]
	Expansion ROM at <ignored> [disabled]
	Capabilities: [48] Vendor Specific Information: Len=08 <?>
	Capabilities: [50] Power Management version 3
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold-)
		Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-
	Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
		DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
		LnkCap: Port #2, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta: Speed 8GT/s (ok), Width x4 (downgraded)
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, NROPrPrP-, LTR-
			10BitTagComp-, 10BitTagReq-, OBFF Not Supported, ExtFmt+, EETLPPrefix+, MaxEETLPPrefixes 1
			EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			FRS-
			AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
			AtomicOpsCtl: ReqEn-
		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
			Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+, EqualizationPhase1+
			EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000 Data: 0000
	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [150 v2] Advanced Error Reporting
		UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
		CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
	Capabilities: [270 v1] Secondary PCI Express
		LnkCtl3: LnkEquIntrruptEn-, PerformEqu-
		LaneErrStat: 0
	Capabilities: [2b0 v1] Address Translation Service (ATS)
		ATSCap: Invalidate Queue Depth: 00
		ATSCtl: Enable+, Smallest Translation Unit: 00
	Capabilities: [2c0 v1] Page Request Interface (PRI)
		PRICtl: Enable- Reset-
		PRISta: RF- UPRGI- Stopped+
		Page Request Capacity: 00000020, Page Request Allocation: 00000000
	Capabilities: [2d0 v1] Process Address Space ID (PASID)
		PASIDCap: Exec+ Priv+, Max PASID Width: 10
		PASIDCtl: Enable- Exec- Priv-
	Kernel driver in use: vfio-pci
	Kernel modules: amdgpu

While running the VM (with IOMMU errors):

# lspci -vvnn -s 0000:06:00.0
06:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Bonaire XT [Radeon HD 7790/8770 / R7 360 / R9 260/360 OEM] [1002:665c] (prog-if 00 [VGA controller])
	Subsystem: ASUSTeK Computer Inc. Radeon HD 7790 DirectCU II OC [1043:0452]
	Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 42
	Region 0: Memory at 1030000000 (64-bit, prefetchable) [size=256M]
	Region 2: Memory at 1040000000 (64-bit, prefetchable) [size=8M]
	Region 4: I/O ports at d000 [size=256]
	Region 5: Memory at d0100000 (32-bit, non-prefetchable) [size=256K]
	Expansion ROM at <ignored> [disabled]
	Capabilities: [48] Vendor Specific Information: Len=08 <?>
	Capabilities: [50] Power Management version 3
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
		DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
		LnkCap: Port #2, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta: Speed 8GT/s (ok), Width x4 (downgraded)
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, NROPrPrP-, LTR-
			10BitTagComp-, 10BitTagReq-, OBFF Not Supported, ExtFmt+, EETLPPrefix+, MaxEETLPPrefixes 1
			EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			FRS-
			AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
			AtomicOpsCtl: ReqEn-
		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
			Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+, EqualizationPhase1+
			EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000 Data: 0000
	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [150 v2] Advanced Error Reporting
		UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
		CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
	Capabilities: [270 v1] Secondary PCI Express
		LnkCtl3: LnkEquIntrruptEn-, PerformEqu-
		LaneErrStat: 0
	Capabilities: [2b0 v1] Address Translation Service (ATS)
		ATSCap: Invalidate Queue Depth: 00
		ATSCtl: Enable+, Smallest Translation Unit: 00
	Capabilities: [2c0 v1] Page Request Interface (PRI)
		PRICtl: Enable- Reset-
		PRISta: RF- UPRGI- Stopped+
		Page Request Capacity: 00000020, Page Request Allocation: 00000000
	Capabilities: [2d0 v1] Process Address Space ID (PASID)
		PASIDCap: Exec+ Priv+, Max PASID Width: 10
		PASIDCtl: Enable- Exec- Priv-
	Kernel driver in use: vfio-pci
	Kernel modules: amdgpu

--
Tasos