[Bug 201957] amdgpu: ring gfx timeout

https://bugzilla.kernel.org/show_bug.cgi?id=201957

rafael castillo (jrch2k10@xxxxxxxxx) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jrch2k10@xxxxxxxxx

--- Comment #80 from rafael castillo (jrch2k10@xxxxxxxxx) ---
Same issue here (it also happens with the LTS kernel).

Linux archlinux 5.18.7-262-tkg-pds #1 TKG SMP PREEMPT_DYNAMIC Mon, 27 Jun 2022
15:50:06 +0000 x86_64 GNU/Linux

[11090.086287] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11090.086296] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11090.086302] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11090.195133] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11090.195139] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11090.195143] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11090.195150] [drm] Cannot get clockgating state when UVD is powergated.
[11090.195152] [drm] Cannot get clockgating state when VCE is powergated.
[11090.695288] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11090.699331] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11091.194893] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11091.194898] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11091.194901] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11091.194908] [drm] Cannot get clockgating state when UVD is powergated.
[11091.194909] [drm] Cannot get clockgating state when VCE is powergated.
[11091.695473] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11092.194965] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11092.194969] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11092.194973] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11092.194979] [drm] Cannot get clockgating state when UVD is powergated.
[11092.194980] [drm] Cannot get clockgating state when VCE is powergated.
[11092.695749] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11093.195046] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11093.195050] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11093.195053] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11093.195060] [drm] Cannot get clockgating state when UVD is powergated.
[11093.195061] [drm] Cannot get clockgating state when VCE is powergated.
[11093.695004] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11094.195065] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11094.195070] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11094.195074] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11094.195082] [drm] Cannot get clockgating state when UVD is powergated.
[11094.195083] [drm] Cannot get clockgating state when VCE is powergated.
[11094.695286] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11095.131026] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for
fences timed out!
[11095.195055] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11095.195061] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11095.195065] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11095.195071] [drm] Cannot get clockgating state when UVD is powergated.
[11095.195072] [drm] Cannot get clockgating state when VCE is powergated.
[11095.695232] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11096.195132] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11096.195137] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11096.195140] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11096.195146] [drm] Cannot get clockgating state when UVD is powergated.
[11096.195147] [drm] Cannot get clockgating state when VCE is powergated.
[11096.694900] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11097.195057] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11097.195061] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11097.195064] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11097.195070] [drm] Cannot get clockgating state when UVD is powergated.
[11097.195071] [drm] Cannot get clockgating state when VCE is powergated.
[11097.695156] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11098.195054] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11098.195058] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11098.195062] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11098.195068] [drm] Cannot get clockgating state when UVD is powergated.
[11098.195069] [drm] Cannot get clockgating state when VCE is powergated.
[11098.695226] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11099.195056] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11099.195060] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11099.195064] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11099.195070] [drm] Cannot get clockgating state when UVD is powergated.
[11099.195071] [drm] Cannot get clockgating state when VCE is powergated.
[11099.695224] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11100.175702] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout,
signaled seq=2678111, emitted seq=2678113
[11100.175937] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information:
process ArcheAge.exe pid 702264 thread ArcheAge.e:cs0 pid 703382
[11100.176120] amdgpu 0000:02:00.0: amdgpu: GPU reset begin!
[11104.176155] amdgpu 0000:02:00.0: amdgpu: failed to suspend display audio
[11104.176290] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11104.176294] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11104.176296] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11104.176298] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11104.176299] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11104.176301] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11104.176303] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11104.176305] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11104.176307] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11104.176309] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11104.176311] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11104.176312] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11104.176314] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11104.176316] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11104.176318] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11104.176320] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11104.176321] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11104.176417] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11104.176420] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11104.176421] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11104.176423] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11104.176425] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11104.176427] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11118.768958] audit: type=1100 audit(1656469160.416:402): pid=707085 uid=0
auid=4294967295 ses=4294967295 msg='op=PAM:authentication
grantors=pam_shells,pam_faillock,pam_permit,pam_faillock acct="junior"
exe="/usr/bin/sshd" hostname=192.168.10.47 addr=192.168.10.47 terminal=ssh
res=success'
[11118.769433] audit: type=1101 audit(1656469160.416:403): pid=707085 uid=0
auid=4294967295 ses=4294967295 msg='op=PAM:accounting
grantors=pam_access,pam_unix,pam_permit,pam_time acct="junior"
exe="/usr/bin/sshd" hostname=192.168.10.47 addr=192.168.10.47 terminal=ssh
res=success'
[11118.769972] audit: type=1103 audit(1656469160.418:404): pid=707085 uid=0
auid=4294967295 ses=4294967295 msg='op=PAM:setcred
grantors=pam_shells,pam_faillock,pam_permit,pam_faillock acct="junior"
exe="/usr/bin/sshd" hostname=192.168.10.47 addr=192.168.10.47 terminal=ssh
res=success'
[11118.770029] audit: type=1006 audit(1656469160.418:405): pid=707085 uid=0
old-auid=4294967295 auid=1000 tty=(none) old-ses=4294967295 ses=5 res=1
[11118.770038] audit: type=1300 audit(1656469160.418:405): arch=c000003e
syscall=1 success=yes exit=4 a0=3 a1=7ffd3b3d22d0 a2=4 a3=7ffd3b3d1fe4 items=0
ppid=759 pid=707085 auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0
fsgid=0 tty=(none) ses=5 comm="sshd" exe="/usr/bin/sshd" key=(null)
[11118.770040] audit: type=1327 audit(1656469160.418:405):
proctitle=737368643A206A756E696F72205B707269765D
[11118.785798] audit: type=1105 audit(1656469160.434:406): pid=707085 uid=0
auid=1000 ses=5 msg='op=PAM:session_open
grantors=pam_loginuid,pam_keyinit,pam_systemd_home,pam_limits,pam_unix,pam_permit,pam_mail,pam_systemd,pam_env
acct="junior" exe="/usr/bin/sshd" hostname=192.168.10.47 addr=192.168.10.47
terminal=ssh res=success'
[11118.786714] audit: type=1103 audit(1656469160.434:407): pid=707087 uid=0
auid=1000 ses=5 msg='op=PAM:setcred
grantors=pam_shells,pam_faillock,pam_permit,pam_faillock acct="junior"
exe="/usr/bin/sshd" hostname=192.168.10.47 addr=192.168.10.47 terminal=ssh
res=success'
[11124.189733] [drm:atom_op_jump [amdgpu]] *ERROR* atombios stuck in loop for
more than 20secs aborting
[11124.189930] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios
stuck executing D718 (len 824, WS 0, PS 0) @ 0xD898
[11124.190079] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios
stuck executing D5D2 (len 326, WS 0, PS 0) @ 0xD6C2
[11124.190230] [drm:dce110_link_encoder_disable_output [amdgpu]] *ERROR*
dce110_link_encoder_disable_output: Failed to execute VBIOS command table!
[11126.469943] audit: type=1101 audit(1656469168.118:408): pid=707219 uid=1000
auid=1000 ses=5 msg='op=PAM:accounting grantors=pam_unix,pam_permit,pam_time
acct="junior" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/0
res=success'
[11126.470552] audit: type=1110 audit(1656469168.118:409): pid=707219 uid=1000
auid=1000 ses=5 msg='op=PAM:setcred
grantors=pam_faillock,pam_permit,pam_env,pam_faillock acct="root"
exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/0 res=success'
[11126.472793] audit: type=1105 audit(1656469168.120:410): pid=707219 uid=1000
auid=1000 ses=5 msg='op=PAM:session_open
grantors=pam_systemd_home,pam_limits,pam_unix,pam_permit acct="root"
exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/0 res=success'
[11126.492151] audit: type=1106 audit(1656469168.139:411): pid=707219 uid=1000
auid=1000 ses=5 msg='op=PAM:session_close
grantors=pam_systemd_home,pam_limits,pam_unix,pam_permit acct="root"
exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/0 res=success'
[11126.492202] audit: type=1104 audit(1656469168.139:412): pid=707219 uid=1000
auid=1000 ses=5 msg='op=PAM:setcred
grantors=pam_faillock,pam_permit,pam_env,pam_faillock acct="root"
exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/0 res=success'
[11144.191100] [drm:atom_op_jump [amdgpu]] *ERROR* atombios stuck in loop for
more than 20secs aborting
[11144.191292] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios
stuck executing C16E (len 62, WS 0, PS 0) @ 0xC18A
[11164.192468] [drm:atom_op_jump [amdgpu]] *ERROR* atombios stuck in loop for
more than 20secs aborting
[11164.192658] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios
stuck executing B190 (len 1227, WS 8, PS 8) @ 0xB418
[11164.192828] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11164.192831] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11164.192833] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11164.201396] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend
of IP block <vce_v3_0> failed -110
[11164.216360] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11164.216364] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11164.216366] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11164.216368] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11164.216370] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11164.216371] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11164.216373] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11164.216375] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11164.216377] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11164.216378] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11164.436229] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11164.436234] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11164.436236] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11164.436238] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11164.436240] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11164.436241] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11164.436243] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11164.436246] amdgpu 0000:02:00.0: amdgpu: 
               last message was failed ret is 65535
[11164.436248] amdgpu: Failed to force to switch arbf0!
[11164.436249] amdgpu: [disable_dpm_tasks] Failed to disable DPM!
[11164.436250] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend
of IP block <powerplay> failed -22
[11164.546720] amdgpu 0000:02:00.0: [drm:amdgpu_ring_test_helper [amdgpu]]
*ERROR* ring kiq_2.1.0 test failed (-110)
[11164.546864] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
[11164.767164] amdgpu: cp is busy, skip halt cp
[11164.877251] amdgpu: rlc is busy, skip halt rlc
[11164.988549] CPU: 2 PID: 705317 Comm: kworker/u48:4 Tainted: G           OE  
  5.18.7-262-tkg-pds #1 ab3a1701b6bb2d2603e5fe14656a947bbae77de2
[11164.988553] Hardware name: ATERMITER ZX-99EV3/ZX-99EV3, BIOS X99AT011
10/15/2020
[11164.988554] Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
[11164.988561] Call Trace:
[11164.988562]  <TASK>
[11164.988563]  dump_stack_lvl+0x48/0x5d
[11164.988570]  amdgpu_do_asic_reset+0x2a/0x470 [amdgpu
d2028a110b701082c428a38d2a7699ba96e2f894]
[11164.988790]  amdgpu_device_gpu_recover_imp.cold+0x537/0x8cc [amdgpu
d2028a110b701082c428a38d2a7699ba96e2f894]
[11164.989002]  amdgpu_job_timedout+0x18c/0x1c0 [amdgpu
d2028a110b701082c428a38d2a7699ba96e2f894]
[11164.989183]  drm_sched_job_timedout+0x76/0x100 [gpu_sched
ca892a3eb32539b04f830de75b342015ecf19774]
[11164.989188]  process_one_work+0x1c7/0x380
[11164.989192]  worker_thread+0x51/0x380
[11164.989195]  ? rescuer_thread+0x3a0/0x3a0
[11164.989197]  kthread+0xde/0x110
[11164.989200]  ? kthread_complete_and_exit+0x20/0x20
[11164.989203]  ret_from_fork+0x22/0x30
[11164.989208]  </TASK>
[11164.989212] amdgpu 0000:02:00.0: amdgpu: BACO reset
[drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:53:crtc-0] hw_done or
flip_done timed out
[11187.893035] radeon-profile[54935]: segfault at 0 ip 00007fe553eee6ef sp
00007ffc8035f9e0 error 4 in libQt5Core.so.5.15.5[7fe553e9f000+2d6000]
[11187.893049] Code: 38 64 48 8b 04 25 28 00 00 00 48 89 44 24 28 31 c0 e8 d5
98 ff ff 48 85 c0 0f 84 f2 3c fb ff 48 89 c3 4c 8d 68 50 48 8b 40 50 <49> 63 2c
24 3b 68 04 7d 78 8b 10 83 fa 01 76 26 8b 70 08 81 e6 ff

[drm:atom_op_jump [amdgpu]] *ERROR* atombios stuck in loop for more than 20secs
aborting
[11206.839405] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios
stuck executing C16E (len 62, WS 0, PS 0) @ 0xC18A
[11206.839546] [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios
stuck executing AB18 (len 142, WS 0, PS 8) @ 0xAB33
[11206.839688] amdgpu 0000:02:00.0: amdgpu: asic atom init failed!
[11206.839725] amdgpu 0000:02:00.0: amdgpu: GPU reset(2) failed
[11206.839746] amdgpu 0000:02:00.0: amdgpu: GPU reset end with ret = -22
[11206.839748] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* GPU Recovery Failed:
-22

[11216.913239] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout,
signaled seq=2678113, emitted seq=2678113
[11216.913503] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information:
process ArcheAge.exe pid 702264 thread ArcheAge.e:cs0 pid 703382
[11216.913700] amdgpu 0000:02:00.0: amdgpu: GPU reset begin!
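
For anyone comparing logs across reports, here is a minimal sketch of how the "ring gfx timeout" lines above could be pulled out of a saved dmesg dump. It is not part of the original report: the function name `ring_timeouts` and the `pending` field are made up for illustration, and reading `emitted - signaled` as the number of fences still outstanding is an assumption about what the two counters mean.

```python
import re

# Matches amdgpu "ring <name> timeout" reports. \s+ tolerates the line wrap
# that mail clients insert between "timeout," and "signaled".
RING_TIMEOUT = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"\[drm:amdgpu_job_timedout \[amdgpu\]\] \*ERROR\*\s+"
    r"ring (?P<ring>\S+) timeout,\s+"
    r"signaled seq=(?P<signaled>\d+),\s+emitted seq=(?P<emitted>\d+)"
)


def ring_timeouts(log_text):
    """Return one dict per ring timeout report found in log_text."""
    results = []
    for m in RING_TIMEOUT.finditer(log_text):
        signaled = int(m.group("signaled"))
        emitted = int(m.group("emitted"))
        results.append({
            "time": float(m.group("ts")),
            "ring": m.group("ring"),
            "signaled": signaled,
            "emitted": emitted,
            # Assumed meaning: fences emitted but not yet signaled.
            "pending": emitted - signaled,
        })
    return results


# The two timeout reports from the log above, copied verbatim.
sample = """[11100.175702] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout,
signaled seq=2678111, emitted seq=2678113
[11216.913239] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout,
signaled seq=2678113, emitted seq=2678113
"""

for report in ring_timeouts(sample):
    print(report)
```

On the two reports in this log, the sketch gives a `pending` value of 2 for the first timeout and 0 for the second.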
