Comment # 55
on bug 102322
from Andrey Grodzovsky
(In reply to dwagner from comment #54)
> (In reply to Andrey Grodzovsky from comment #53)
> > Created attachment 141198
> > add_debug_info2.patch
> >
> > Try this patch instead, I might be missing some prints in the first one.
>
> Can try that this evening.
>
> > In the last log you attached I haven't seen any UMR dumps or GPU fault
> > prints in dmesg. The GPU fault has to be in the log to compare the faulty
> > address against the debug prints in the patch.
>
> In above attached file "xz-compressed output of gpu_debug3.sh" there is umr
> output at the time of the crash (238 seconds after the reboot):
>
> ----------------------------------------------
> ...
> mpv/vo-897 [005] .... 235.191542: dma_fence_wait_start:
> driver=drm_sched timeline=gfx context=162 seqno=87
> mpv/vo-897 [005] d... 235.191548: dma_fence_enable_signal:
> driver=drm_sched timeline=gfx context=162 seqno=87
> kworker/0:2-92 [000] .... 238.275988: dma_fence_signaled:
> driver=amdgpu timeline=sdma1 context=11 seqno=210
> kworker/0:2-92 [000] .... 238.276004: dma_fence_signaled:
> driver=amdgpu timeline=sdma1 context=11 seqno=211
> [ 238.180634] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0
> timeout, signaled seq=32624, emitted seq=32626
> [ 238.180641] amdgpu 0000:0a:00.0: GPU reset begin!
> [ 238.180641] amdgpu 0000:0a:00.0: GPU reset begin!
>
> crash detected!
>
> executing umr -O halt_waves -wa
> No active waves!

Did you use the amdgpu.vm_fault_stop=2 parameter? If a fault had happened, that should have frozen the GPU's compute units, and hence the above command would have produced a lot of wave info.

>
>
> executing umr -O verbose -R gfx[.]
>
> polaris11.gfx.rptr == 1792
> polaris11.gfx.wptr == 1792
> polaris11.gfx.drv_wptr == 1792
> polaris11.gfx.ring[1761] == 0xffff1000 ...
> polaris11.gfx.ring[1762] == 0xffff1000 ...
> polaris11.gfx.ring[1763] == 0xffff1000 ...
> polaris11.gfx.ring[1764] == 0xffff1000 ...
> polaris11.gfx.ring[1765] == 0xffff1000 ...
> polaris11.gfx.ring[1766] == 0xffff1000 ...
> polaris11.gfx.ring[1767] == 0xffff1000 ...
> polaris11.gfx.ring[1768] == 0xffff1000 ...
> polaris11.gfx.ring[1769] == 0xffff1000 ...
> polaris11.gfx.ring[1770] == 0xffff1000 ...
> polaris11.gfx.ring[1771] == 0xffff1000 ...
> polaris11.gfx.ring[1772] == 0xffff1000 ...
> polaris11.gfx.ring[1773] == 0xffff1000 ...
> polaris11.gfx.ring[1774] == 0xffff1000 ...
> polaris11.gfx.ring[1775] == 0xffff1000 ...
> polaris11.gfx.ring[1776] == 0xffff1000 ...
> polaris11.gfx.ring[1777] == 0xffff1000 ...
> polaris11.gfx.ring[1778] == 0xffff1000 ...
> polaris11.gfx.ring[1779] == 0xffff1000 ...
> polaris11.gfx.ring[1780] == 0xffff1000 ...
> polaris11.gfx.ring[1781] == 0xffff1000 ...
> polaris11.gfx.ring[1782] == 0xffff1000 ...
> polaris11.gfx.ring[1783] == 0xffff1000 ...
> polaris11.gfx.ring[1784] == 0xffff1000 ...
> polaris11.gfx.ring[1785] == 0xffff1000 ...
> polaris11.gfx.ring[1786] == 0xffff1000 ...
> polaris11.gfx.ring[1787] == 0xffff1000 ...
> polaris11.gfx.ring[1788] == 0xffff1000 ...
> polaris11.gfx.ring[1789] == 0xffff1000 ...
> polaris11.gfx.ring[1790] == 0xffff1000 ...
> polaris11.gfx.ring[1791] == 0xffff1000 ...
> polaris11.gfx.ring[1792] == 0xc0032200 rwD
>
> trying to get ADR from dmesg output for 'umr -O verbose -vm ...'
> trying to get VMID from dmesg output for 'umr -O verbose -vm ...'
>
> done after crash, flashing NUMLOCK LED.
> amdgpu_cs:0-799 [001] .... 286.852838: amdgpu_bo_list_set:
> list=0000000099c16b5c, bo=000000001771c26f, bo_size=131072
> amdgpu_cs:0-799 [001] .... 286.852846: amdgpu_bo_list_set:
> list=0000000099c16b5c, bo=0000000046bfd439, bo_size=131072
> ...
> ----------------------------------------------
>
> But sure, there were no "VM_CONTEXT1_PROTECTION_FAULT_ADDR" error messages
> this time. Sometimes such are emitted, sometimes not.
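
Regarding the amdgpu.vm_fault_stop=2 question above, one quick way to confirm what the loaded module is actually using is to read the value back from sysfs before reproducing the hang. The following is only a rough sketch: the function name `vm_fault_stop_status` is made up for illustration, and it assumes the amdgpu module is loaded and exposes the parameter under the usual /sys/module/<module>/parameters/ location.

```shell
#!/bin/sh
# Hypothetical helper (not part of gpu_debug3.sh or umr): report the value of
# the amdgpu vm_fault_stop module parameter currently in effect.
# The optional first argument overrides the sysfs path, which is assumed to be
# the standard location for module parameters.
vm_fault_stop_status() {
    param_file="${1:-/sys/module/amdgpu/parameters/vm_fault_stop}"
    if [ ! -r "$param_file" ]; then
        # Module not loaded, or the parameter is not exposed on this kernel.
        echo "unknown"
        return 1
    fi
    value=$(cat "$param_file")
    if [ "$value" = "2" ]; then
        echo "vm_fault_stop=2 (set as requested)"
    else
        echo "vm_fault_stop=$value (expected 2)"
    fi
}

# Print the status for the running system; do not abort if it is unavailable.
vm_fault_stop_status || true
```

If this reports anything other than 2, the parameter probably did not make it onto the kernel command line or into the modprobe options, which would be one possible explanation for "No active waves!".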