[Bug 112242] amdgpu [RX Vega 56]: ring sdma0 timeout

Bug ID:    112242
Summary:   amdgpu [RX Vega 56]: ring sdma0 timeout
Product:   DRI
Version:   unspecified
Hardware:  x86-64 (AMD64)
OS:        Linux (All)
Status:    NEW
Severity:  major
Priority:  not set
Component: DRM/AMDgpu
Assignee:  dri-devel@lists.freedesktop.org
Reporter:  mh@familie-heinz.name

Hi,

I've reported this over at bugzilla.kernel.org but didn't get any help there.
Maybe because nobody is expecting bug reports about the amdgpu driver over on
the kernel's bug tracker?

So this started a while ago, when I updated from 5.0.0 to a newer kernel. I'm
currently at 5.3.0 and for almost any game I play I run into this problem:

Aug 24 11:13:33 egalite kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=368056, emitted seq=368057
Aug 24 11:13:33 egalite kernel: [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:47:crtc-0] flip_done timed out
Aug 24 11:13:33 egalite kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process 7DaysToDie.x86_ pid 8108 thread 7DaysToDie:cs0
Aug 24 11:13:33 egalite kernel: amdgpu 0000:0c:00.0: GPU reset begin!
Aug 24 11:13:33 egalite kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered

Only a hard reset let me recover from that.
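
For anyone comparing their own logs against the one above, the ring name and the two sequence numbers can be pulled out with a few lines of Python. This is only a minimal sketch: the regular expression is an assumption derived from the message text quoted in this report, not from any documented amdgpu log format, and `parse_ring_timeouts` is a hypothetical helper name.

```python
import re

# Assumption: pattern inferred from the log lines quoted in this report,
# not from a documented amdgpu message format.
TIMEOUT_RE = re.compile(
    r"ring\s+(?P<ring>\S+)\s+timeout,\s+"
    r"signaled seq=(?P<signaled>\d+),\s+emitted seq=(?P<emitted>\d+)"
)


def parse_ring_timeouts(log_text):
    """Return a list of (ring, signaled, emitted, pending) tuples."""
    # Collapse all whitespace so entries wrapped by a mail client still match.
    flat = " ".join(log_text.split())
    results = []
    for match in TIMEOUT_RE.finditer(flat):
        signaled = int(match.group("signaled"))
        emitted = int(match.group("emitted"))
        # pending = emitted sequence numbers that were never signaled
        results.append((match.group("ring"), signaled, emitted, emitted - signaled))
    return results


sample = """
Aug 24 11:13:33 egalite kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring
sdma0 timeout, signaled seq=368056, emitted seq=368057
Aug 24 11:13:33 egalite kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring
gfx timeout, but soft recovered
"""

for ring, signaled, emitted, pending in parse_ring_timeouts(sample):
    print(f"{ring}: signaled={signaled} emitted={emitted} pending={pending}")
```

Only the sdma0 entry carries sequence numbers, so the "ring gfx timeout, but soft recovered" line is deliberately not matched by this pattern.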

I did some kernel traces which I will copy over to this report, if necessary,
but for now you can download them here:
https://bugzilla.kernel.org/show_bug.cgi?id=204683

It also looks a bit like this bug:
https://bugzilla.kernel.org/show_bug.cgi?id=201957, because I also get the
"ring gfx timeout". And there are lots and lots of people having this issue.

I tried bisecting it, but failed. Either I missed the commit that causes this,
or there are multiple reasons why this happens, or this really goes way back to
the time when 4.18 was the base for drm-next (which doesn't compile with modern
compilers anymore). Also, Steam doesn't want to run on those old kernels, so
even when I was able to compile an older kernel, there was no way to test it.

I even tried debugging it over ethernet (KGDBoE is a nice thing if you need
performance), but somehow this slowed everything down enough to not trigger the
bug.

I also tried the suggestions from
https://bugs.freedesktop.org/show_bug.cgi?id=109955, but forbidding the lowest
clock mode doesn't help either. (It fixes my RocketLeague problems, though).
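
For readers unfamiliar with the workaround, one way "forbidding the lowest clock mode" can be done is through the amdgpu sysfs files `power_dpm_force_performance_level` and `pp_dpm_sclk`. The sketch below assumes that interface, assumes the card is `card0`, and assumes the usual `index: frequency` line layout in `pp_dpm_sclk`; it is not necessarily the exact procedure from the linked report, and the function names are hypothetical.

```python
from pathlib import Path

# Assumption: the GPU is card0; the index differs between systems.
DEVICE = Path("/sys/class/drm/card0/device")


def levels_without_lowest(pp_dpm_text):
    """Parse pp_dpm_sclk-style text and return every level index except the lowest."""
    levels = []
    for line in pp_dpm_text.splitlines():
        index, sep, _rest = line.partition(":")
        if sep and index.strip().isdigit():
            levels.append(int(index.strip()))
    levels.sort()
    return levels[1:]


def forbid_lowest_sclk(device=DEVICE):
    """Needs root. Switch to manual mode, then allow all sclk levels but the lowest."""
    allowed = levels_without_lowest((device / "pp_dpm_sclk").read_text())
    (device / "power_dpm_force_performance_level").write_text("manual\n")
    (device / "pp_dpm_sclk").write_text(" ".join(map(str, allowed)) + "\n")


# Parsing can be tried without touching the hardware (made-up example values):
example = "0: 852Mhz\n1: 991Mhz *\n2: 1138Mhz\n"
print(levels_without_lowest(example))
```

The setting does not survive a reboot; writing `auto` back to `power_dpm_force_performance_level` restores the default behaviour.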

Please advise what I should try next.

Best regards
Matthias


_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel