Comment # 8
on bug 107152
from Andrey Grodzovsky
dwanger, i think you already have all the trace tools installed from previous debug sessions so this should be quick for you - Update to latest kernel from https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-drm-next Load the system and before starting reproduce run the following trace command - sudo trace-cmd start -e dma_fence -e gpu_scheduler -e amdgpu -v -e "amdgpu:amdgpu_mm_rreg" -e "amdgpu:amdgpu_mm_wreg" -e "amdgpu:amdgpu_iv" after VM_FAULT happened extract the log from /sys/kernel/debug/tracing also run sudo umr -O verbose -R gfx[.] sudo umr -O halt_waves -wa Now let's say this your log crash Jul 07 01:08:20 ryzen kernel: amdgpu 0000:0a:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00100190 Jul 07 01:08:20 ryzen kernel: amdgpu 0000:0a:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0E04400C Jul 07 01:08:20 ryzen kernel: amdgpu 0000:0a:00.0: VM fault (0x0c, vmid 7, pasid 32768) at page 1048976, read from 'TC1' (0x54433100) (68) Do umr -O verbose -vm 7@100190000 1 where 7 is vmid value and 100190000 is VM_CONTEXT1_PROTECTION_FAULT_ADDR value with extra '000' to get from virtual page number to actual virtual address (left shift 4096b). I can look at the log then and also run it by our MESA/LLVM experts to try and figure out what's going on.
You are receiving this mail because:
- You are the assignee for the bug.
_______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel