https://bugzilla.kernel.org/show_bug.cgi?id=205089

Manuel Jesús de la Fuente (m@xxxxxxxxx) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |m@xxxxxxxxx

--- Comment #40 from Manuel Jesús de la Fuente (m@xxxxxxxxx) ---
Can still reproduce using the following:

- Ryzen 9 5900XT
- Radeon RX 6700XT
- Linux 5.17.4-1-default (openSUSE Tumbleweed with KDE Plasma)
- Mesa 22.0.2-308.2

May 08 20:18:32 localhost.localdomain kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=2371535, emitted seq=2371537
May 08 20:18:32 localhost.localdomain kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process kwin_x11 pid 1795 thread kwin_x11:cs0 pid 1801
May 08 20:18:32 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: GPU reset begin!
May 08 20:18:33 localhost.localdomain kernel: amdgpu 0000:2d:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
May 08 20:18:33 localhost.localdomain kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
May 08 20:18:33 localhost.localdomain kernel: amdgpu 0000:2d:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
May 08 20:18:33 localhost.localdomain kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
May 08 20:18:33 localhost.localdomain kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
May 08 20:18:33 localhost.localdomain kernel: [drm] free PSP TMR buffer
May 08 20:18:33 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: MODE1 reset
May 08 20:18:33 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: GPU mode1 reset
May 08 20:18:33 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: GPU smu mode1 reset
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: GPU reset succeeded, trying to resume
May 08 20:18:34 localhost.localdomain kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000300000).
May 08 20:18:34 localhost.localdomain kernel: [drm] VRAM is lost due to GPU reset!
May 08 20:18:34 localhost.localdomain kernel: [drm] PSP is resuming...
May 08 20:18:34 localhost.localdomain kernel: [drm] reserve 0xa00000 from 0x82fe000000 for PSP TMR
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: RAS: optional ras ta ucode is not available
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: SMU is resuming...
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: smu driver if version = 0x0000000e, smu fw if version = 0x00000012, smu fw version = 0x00413500 (65.53.0)
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: SMU driver if version not matched
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: SMU is resumed successfully!
May 08 20:18:34 localhost.localdomain kernel: [drm] DMUB hardware initialized: version=0x0202000C
May 08 20:18:34 localhost.localdomain kernel: [drm] kiq ring mec 2 pipe 1 q 0
May 08 20:18:34 localhost.localdomain kernel: [drm] VCN decode and encode initialized successfully(under DPG Mode).
May 08 20:18:34 localhost.localdomain kernel: [drm] JPEG decode initialized successfully.
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: recover vram bo from shadow start
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: recover vram bo from shadow done
May 08 20:18:34 localhost.localdomain kernel: [drm] Skip scheduling IBs!
May 08 20:18:34 localhost.localdomain kernel: [drm] Skip scheduling IBs!
May 08 20:18:34 localhost.localdomain kernel: amdgpu 0000:2d:00.0: amdgpu: GPU reset(2) succeeded!
May 08 20:18:34 localhost.localdomain kernel: [drm] Skip scheduling IBs!
[ ... the previous line, but loads of times ]
May 08 20:18:34 localhost.localdomain kernel: [drm] Skip scheduling IBs!
May 08 20:18:34 localhost.localdomain kernel: amdgpu_cs_ioctl: 46 callbacks suppressed
May 08 20:18:34 localhost.localdomain kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ ... the previous line, but loads of times. These are the '-125!' ones ]
May 08 20:18:44 localhost.localdomain kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
May 08 20:18:44 localhost.localdomain xembedsniproxy[1862]: Container window visible, stack below
May 08 20:18:44 localhost.localdomain kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!

One interesting detail, and a partial workaround, is that underclocking the RAM makes the freezes less frequent. Setting it to 2400 specifically (the native speed of the 32 GB of RAM is 3600) makes them happen much less often, though they still happen.

Another thing is that this might be somehow related to the GPU's built-in audio conflicting with Intel's snd_hda_intel, which shows up in a few others' logs (and sometimes in mine too). Audio is also choppy until Pulse is restarted with pulseaudio -k, which might be the cause of this first freeze with RAM at 2400. This may be unrelated though, and is just conjecture on my part.

Happy to help debug the issue if anyone can guide me through the process a bit. I will also look into reporting this on the Mesa side.
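For anyone comparing logs on this bug, here is a minimal Python sketch that pulls the ring timeout and the error codes out of lines like the ones above. It is only a triage aid, not anything the driver or the reporter provides. The function name `summarize` and the two regular expressions are made up for illustration, and the errno names are the generic Linux values (110 is ETIMEDOUT, 125 is ECANCELED). Reading the difference between emitted and signaled seq as the number of submissions still outstanding is an assumption.

```python
import re

# Generic Linux errno values that appear in the log above.
ERRNO_NAMES = {110: "ETIMEDOUT", 125: "ECANCELED"}

# Hypothetical patterns written against the message text quoted in this comment.
TIMEOUT_RE = re.compile(r"ring (\S+) timeout, signaled seq=(\d+), emitted seq=(\d+)")
ERRNO_RE = re.compile(r"(?:test failed \(|Failed to initialize parser )-(\d+)")


def summarize(lines):
    """Return (ring timeouts, errno counts) found in amdgpu kernel log lines."""
    timeouts = []      # (ring name, emitted seq - signaled seq)
    errno_counts = {}  # errno name -> number of log lines carrying it
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            signaled, emitted = int(m.group(2)), int(m.group(3))
            timeouts.append((m.group(1), emitted - signaled))
        m = ERRNO_RE.search(line)
        if m:
            code = int(m.group(1))
            name = ERRNO_NAMES.get(code, "errno %d" % code)
            errno_counts[name] = errno_counts.get(name, 0) + 1
    return timeouts, errno_counts


sample = [
    "[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=2371535, emitted seq=2371537",
    "[drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)",
    "[drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!",
]
print(summarize(sample))
```

To run it over a saved log, pass an open file instead of `sample`, for example `summarize(open("kernel.log"))`, where `kernel.log` is a placeholder file name.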