[Bug 107762] [Intel GFX CI] *ERROR* ring sdma0 timeout, signaled seq=137, emitted seq=137

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



changed bug 107762
What Removed Added
CC   ckoenig.leichtzumerken@gmail.com, dev@lynxeye.de

Comment # 2 on bug 107762 from
(In reply to Martin Peres from comment #0)
> [  358.292609] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=137, emitted seq=137
> [  358.292635] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=145, emitted seq=145

(In reply to Martin Peres from comment #1)
> [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=137, emitted seq=137
> [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=147, emitted seq=147

Hmm, signalled and emitted sequence numbers are always the same, meaning the
hardware hasn't actually timed out?

I can think of two possibilities:

* A GPU scheduler bug causing the job timeout handling to be triggered
spuriously. (Could something be stalling the system work queue, so the items
scheduled by drm_sched_job_finish_cb can't call drm_sched_job_finish in time?)

* A problem with the handling of the GPU's interrupts. Do the numbers on the
amdgpu line in /proc/interrupts still increase after these messages appeared,
or at least in the ten seconds before they appear?


You are receiving this mail because:
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux