Comment # 57
on bug 102322
from Andrey Grodzovsky
(In reply to dwagner from comment #56)
> (In reply to Andrey Grodzovsky from comment #55)
> > > In above attached file "xz-compressed output of gpu_debug3.sh" there is umr
> > > output at the time of the crash (238 seconds after the reboot):
> > >
> > > ----------------------------------------------
> > > ...
> > > mpv/vo-897 [005] .... 235.191542: dma_fence_wait_start:
> > > driver=drm_sched timeline=gfx context=162 seqno=87
> > > mpv/vo-897 [005] d... 235.191548: dma_fence_enable_signal:
> > > driver=drm_sched timeline=gfx context=162 seqno=87
> > > kworker/0:2-92 [000] .... 238.275988: dma_fence_signaled:
> > > driver=amdgpu timeline=sdma1 context=11 seqno=210
> > > kworker/0:2-92 [000] .... 238.276004: dma_fence_signaled:
> > > driver=amdgpu timeline=sdma1 context=11 seqno=211
> > > [ 238.180634] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0
> > > timeout, signaled seq=32624, emitted seq=32626
> > > [ 238.180641] amdgpu 0000:0a:00.0: GPU reset begin!
> > > [ 238.180641] amdgpu 0000:0a:00.0: GPU reset begin!
> > >
> > > crash detected!
> > >
> > > executing umr -O halt_waves -wa
> > > No active waves!
> >
> > Did you use amdgpu.vm_fault_stop=2 parameter ? In case a fault happened that
> > should have froze GPUs compute units and hence the above command would
> > produce a lot of wave info.
>
> Yes I did, as can be seen from the kernel command line at the very beginning
> of the file I attached:
> [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-linux_amd
> root=UUID=b5d56e15-18f3-4783-af84-bbff3bbff3ef rw
> cryptdevice=/dev/nvme0n1p2:root:allow-discards libata.force=1.5 video=DP-1:d
> video=DVI-D-1:d video=HDMI-A-1:1024x768 amdgpu.dc=1 amdgpu.vm_update_mode=0
> amdgpu.dpm=-1 amdgpu.ppfeaturemask=0xffffffff amdgpu.vm_fault_stop=2
> amdgpu.vm_debug=1
>
> Could the "amdgpu 0000:0a:00.0: GPU reset begin!" message indicate a
> procedure that discards whatever has been in thoses "waves" before? If yes,
> could amdgpu.gpu_recovery=0 prevent that from happening?

Yes, I missed that one. Use amdgpu.gpu_recovery=0, so there are no resets.
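
For anyone reproducing this, the two module options discussed above can be checked against a kernel command line with a few lines of Python. This is only a rough sketch: the function names are made up for illustration, and the "wanted" values (vm_fault_stop=2, gpu_recovery=0) are simply the ones mentioned in this thread, not a general recommendation.

```python
def amdgpu_params(cmdline):
    """Collect amdgpu.<name>=<value> options from a kernel command line."""
    params = {}
    for token in cmdline.split():
        if token.startswith("amdgpu.") and "=" in token:
            name, value = token[len("amdgpu."):].split("=", 1)
            params[name] = value
    return params


def missing_debug_settings(cmdline, wanted=None):
    """Return the wanted options that are absent or set to another value.

    The defaults are the settings discussed in this thread (an assumption
    for this debugging session, not a general recommendation).
    """
    if wanted is None:
        wanted = {"vm_fault_stop": "2", "gpu_recovery": "0"}
    found = amdgpu_params(cmdline)
    return {name: value for name, value in wanted.items()
            if found.get(name) != value}


def running_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return f.read()
```

Calling missing_debug_settings(running_cmdline()) on the affected machine should return an empty dict once both options are set. With the command line quoted above it would report gpu_recovery as missing.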