Hi Lijo,

> Could you provide the pp_dpm_* values in sysfs with and without the
> patch? Also, could you try forcing PCIE to gen3 (through pp_dpm_pcie)
> if it's not in gen3 when the issue happens?

AFAICT, I can't access those values while the AMD GPU PCI devices are
bound to `vfio-pci`. However, I can at least access the link speed and
width elsewhere in sysfs. So, I gathered what information I could for
two different cases:

- With the PCI devices bound to `vfio-pci`. With this configuration, I
  can start the VM, but the `pp_dpm_*` values are not available since
  the devices are bound to `vfio-pci` instead of `amdgpu`.

- Without the PCI devices bound to `vfio-pci` (i.e. after removing the
  `vfio-pci.ids=...` kernel command line argument). With this
  configuration, I can access the `pp_dpm_*` values, since the PCI
  devices are bound to `amdgpu`. However, I cannot use the VM. If I
  try to start the VM, the display (both the external monitors
  attached to the AMD GPU and the built-in laptop display attached to
  the Intel iGPU) completely freezes.

The output shown below was identical for both the good commit
f1688bd69ec4 ("drm/amd/amdgpu:save psp ring wptr to avoid attack") and
the commit which introduced the issue, f9b7f3703ff9 ("drm/amdgpu/acpi:
make ATPX/ATCS structures global (v2)").

Note that the PCI link speed increased to 8.0 GT/s when the GPU was
under heavy load for both versions, but the clock speeds of the GPU
were different under load. (For the good commit, it was 1295 MHz; for
the bad commit, it was 501 MHz.)

# With the PCI devices bound to `vfio-pci`

## Before starting the VM

% ls /sys/module/amdgpu/drivers/pci:amdgpu
module  bind  new_id  remove_id  uevent  unbind
% find /sys/bus/pci/devices/0000:01:00.0/ -type f -name 'current_link*' -print -exec cat {} \;
/sys/bus/pci/devices/0000:01:00.0/current_link_width
8
/sys/bus/pci/devices/0000:01:00.0/current_link_speed
2.5 GT/s PCIe

## While running the VM, before placing the AMD GPU under heavy load

% find /sys/bus/pci/devices/0000:01:00.0/ -type f -name 'current_link*' -print -exec cat {} \;
/sys/bus/pci/devices/0000:01:00.0/current_link_width
8
/sys/bus/pci/devices/0000:01:00.0/current_link_speed
2.5 GT/s PCIe

## While running the VM, with the AMD GPU under heavy load

% find /sys/bus/pci/devices/0000:01:00.0/ -type f -name 'current_link*' -print -exec cat {} \;
/sys/bus/pci/devices/0000:01:00.0/current_link_width
8
/sys/bus/pci/devices/0000:01:00.0/current_link_speed
8.0 GT/s PCIe

## While running the VM, after stopping the heavy load on the AMD GPU

% find /sys/bus/pci/devices/0000:01:00.0/ -type f -name 'current_link*' -print -exec cat {} \;
/sys/bus/pci/devices/0000:01:00.0/current_link_width
8
/sys/bus/pci/devices/0000:01:00.0/current_link_speed
2.5 GT/s PCIe

## After stopping the VM

% find /sys/bus/pci/devices/0000:01:00.0/ -type f -name 'current_link*' -print -exec cat {} \;
/sys/bus/pci/devices/0000:01:00.0/current_link_width
8
/sys/bus/pci/devices/0000:01:00.0/current_link_speed
2.5 GT/s PCIe

# Without the PCI devices bound to `vfio-pci`

% ls /sys/module/amdgpu/drivers/pci:amdgpu
0000:01:00.0  module  bind  new_id  remove_id  uevent  unbind
% for f in /sys/module/amdgpu/drivers/pci:amdgpu/*/pp_dpm_*; do echo "$f"; cat "$f"; echo; done
/sys/module/amdgpu/drivers/pci:amdgpu/0000:01:00.0/pp_dpm_mclk
0: 300Mhz
1: 625Mhz
2: 1500Mhz *

/sys/module/amdgpu/drivers/pci:amdgpu/0000:01:00.0/pp_dpm_pcie
0: 2.5GT/s, x8
1: 8.0GT/s, x16 *

/sys/module/amdgpu/drivers/pci:amdgpu/0000:01:00.0/pp_dpm_sclk
0: 214Mhz
1: 501Mhz
2: 850Mhz
3: 1034Mhz
4: 1144Mhz
5: 1228Mhz
6: 1275Mhz
7: 1295Mhz *
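Regarding forcing PCIE to gen3: I haven't been able to try this while
the issue is actually happening, since the `pp_dpm_pcie` file only
exists when the device is bound to `amdgpu`, and in that configuration
starting the VM freezes the displays. For reference, this is what I
would run, assuming the usual `power_dpm_force_performance_level` /
`pp_dpm_pcie` interface (level 1 is the 8.0GT/s, x16 entry above):

% cd /sys/bus/pci/devices/0000:01:00.0
% echo manual | sudo tee power_dpm_force_performance_level
% echo 1 | sudo tee pp_dpm_pcie
% cat pp_dpm_pcie

Let me know if there's a way to poke the equivalent state while the
device is bound to `vfio-pci`, and I'll try it.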
% find /sys/bus/pci/devices/0000:01:00.0/ -type f -name 'current_link*' -print -exec cat {} \;
/sys/bus/pci/devices/0000:01:00.0/current_link_width
8
/sys/bus/pci/devices/0000:01:00.0/current_link_speed
8.0 GT/s PCIe

James