[Bug 111807] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout cause process into Disk sleep state

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Bug ID 111807
Summary [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout cause process into Disk sleep state
Product DRI
Version DRI git
Hardware ARM
OS Linux (All)
Status NEW
Severity major
Priority not set
Component DRM/AMDgpu
Assignee dri-devel@lists.freedesktop.org
Reporter liansz@fzcyjh.com

Created attachment 145506 [details]
timeoutlog

We ran into some gfx timeout problems.
Currently, we use the kernel of 4.19.36. We merged some patches regarding GPU
from the community. There are multiple GPUs on each server, and each GPU is
running some rendering programs. Now, there are 2 different cases of failures.
The first one is that one graphics card of a server fails, rendering program
does not have a D state, and it shows error code 110 tested by
/sys/kernel/debug/dri/1/amdgpu_test_ib, then shows pass after a second test.
See tmp-618-2.zip for details.
The second one is that one graphics card of a server fails, the whole rendering
program running on the server fails and has D state. It fails at drm_release.
See tmp-619.zip for details.
Could you please help us out?


You are receiving this mail because:
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux