Comment # 3
on bug 106500
from Andrey Grodzovsky
(In reply to Bas Nieuwenhuizen from comment #2)
> Created attachment 139568 [details]
> dmesg after trying 139562
>
> I tried the patch and as expected we do not deadlock at the original places
> since we don't call those anymore. But I get garbage on my display (possibly
> expected due to loss of VRAM), can't switch VT and stopping X hangs X.
>
> Furthermore I eventually still get stuck fence waits in dmesg (attached).
>
> Furthermore, it seems the UVDF ringtest fails.

I think the garbage is indeed due to lost VRAM; maybe we don't create a shadow BO for the display's BO. The GPU reset fails because UVD fails to resume and because of the SMU failure, so I believe that is why any further fence submission hangs. The pipe never recovers.

Harry, check the patch I attached. There is no reason to call drm_atomic_helper_resume/suspend explicitly from amdgpu_device_gpu_recover. First of all, it's already being called from the display code via the amd_ip_funcs.suspend/resume hooks. Second, the place where it is called in amdgpu_device_gpu_recover is wrong for GPU stalls, since it is called BEFORE we cancel and force completion of all in-flight jobs which are stuck on the GPU. So, as Bas explained, it will try to wait for a fence in amdgpu_pm_compute_clocks, but the pipe is hung, so we end up in a deadlock. If we do the mode set AFTER the forced completion (as the patch does), no deadlock will happen.

The UVD/SMU failures require further debugging, but I am on a different task at the moment, so maybe someone can pick this up...

Do you remember why that code is there? I think it's a remnant of old code. If you are OK with this patch, I will send it for review.

Further
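
To illustrate the ordering argument only, here is a toy model in C. It is not the attached patch and not amdgpu code: every name in it (toy_gpu, toy_display_suspend, and so on) is hypothetical, and the blocking fence wait is modelled as a check that returns false where a real wait would presumably never return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy state; these are NOT real amdgpu structures. */
struct toy_gpu {
    bool pipe_hung;       /* the GPU pipe is stalled */
    bool jobs_in_flight;  /* stuck jobs still hold unsignaled fences */
};

/*
 * Stand-in for a fence wait reached from the display suspend path.
 * Returns false where a real blocking wait would never return:
 * the pipe is hung and the stuck jobs have not been force-completed.
 */
static bool toy_fence_wait_would_return(const struct toy_gpu *gpu)
{
    return !(gpu->pipe_hung && gpu->jobs_in_flight);
}

/* Stand-in for cancelling and force-completing all in-flight jobs. */
static void toy_force_complete_jobs(struct toy_gpu *gpu)
{
    gpu->jobs_in_flight = false;
}

/* Stand-in for the mode-set suspend, which ends up waiting on a fence. */
static bool toy_display_suspend(const struct toy_gpu *gpu)
{
    return toy_fence_wait_would_return(gpu);
}

/* Ordering described as wrong: mode-set suspend BEFORE forced completion. */
bool toy_recover_modeset_first(struct toy_gpu *gpu)
{
    assert(gpu != NULL);
    if (!toy_display_suspend(gpu))
        return false;             /* models the deadlock */
    toy_force_complete_jobs(gpu);
    return true;
}

/* Ordering described as safe: forced completion BEFORE the mode set. */
bool toy_recover_force_completion_first(struct toy_gpu *gpu)
{
    assert(gpu != NULL);
    toy_force_complete_jobs(gpu);
    return toy_display_suspend(gpu);
}
```

With a hung pipe and jobs in flight, toy_recover_modeset_first() reports the modelled deadlock, while toy_recover_force_completion_first() gets through, because the fence wait only happens once nothing is stuck any more.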
_______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel