[Bug 112226] [HadesCanyon] GPU hangs don't anymore recover (although kernel still claims that they do)

Bug ID:    112226
Summary:   [HadesCanyon] GPU hangs don't anymore recover (although kernel still claims that they do)
Product:   DRI
Version:   DRI git
Hardware:  x86-64 (AMD64)
OS:        Linux (All)
Status:    NEW
Severity:  critical
Priority:  not set
Component: DRM/AMDgpu
Assignee:  dri-devel@lists.freedesktop.org
Reporter:  eero.t.tamminen@intel.com

Setup:
* HW: KBL HadesCanyon (i7-8809G with Radeon RX Vega M GH)
* OS: Ubuntu 18.04 with Unity desktop (compiz)
* SW: Git builds of drm-tip kernel, Mesa and X server

Issue:
* The AMD GPU driver no longer recovers from the KBL HadesCanyon GPU hangs described in bug 108898.

The kernel log still claims that it recovered:
-------------------------------------------------------
[ 1057.512690] Iteration 2/3: bin/testfw_app --gfx glfw --gl_api desktop_core --width 1920 --height 1080 --fullscreen 1 --test_id gl_manhattan
[ 1119.867403] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
[ 1124.987449] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
-------------------------------------------------------
However, every 3D test run after this error now fails.
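
For anyone trying to reproduce this, the log excerpt above can be checked automatically. The following is only a minimal sketch: the function names and the regular expression are my own illustration (not part of any driver or test tool), and it assumes each kernel message is on a single, unwrapped line as `dmesg` prints it.

```python
import re

# Assumed line shape, taken from the excerpt in this report:
#   [ 1124.987449] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* <message>
DRM_ERROR = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"\[drm:(?P<func>\w+) \[amdgpu\]\]\s+"
    r"\*ERROR\*\s+(?P<msg>.*)$"
)


def find_amdgpu_errors(log_text):
    """Return (timestamp, function, message) tuples for amdgpu *ERROR* lines."""
    hits = []
    for line in log_text.splitlines():
        m = DRM_ERROR.match(line.strip())
        if m:
            hits.append((float(m.group("ts")), m.group("func"), m.group("msg")))
    return hits


def claims_soft_recovery(hits):
    """True if any amdgpu_job_timedout error says the ring was soft recovered."""
    return any(
        func == "amdgpu_job_timedout" and "soft recovered" in msg
        for _, func, msg in hits
    )


# The two error lines from this report, with the mail line wrapping undone.
SAMPLE = """\
[ 1119.867403] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
[ 1124.987449] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
"""

if __name__ == "__main__":
    errors = find_amdgpu_errors(SAMPLE)
    for ts, func, msg in errors:
        print(f"{ts:.6f} {func}: {msg}")
    print("claims soft recovery:", claims_soft_recovery(errors))
```

A positive result only means the kernel reported a soft recovery. As noted above, that report cannot be trusted here, so a 3D test still has to be run afterwards to see whether the GPU is actually usable.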

The regression appeared between the following drm-tip kernel commits:
* 2019-10-28 16:01:46: 912b87256c: drm-tip: 2019y-10m-28d-16h-00m-10s UTC integration manifest
* 2019-10-29 17:58:05: a2c9f8ce2a: drm-tip: 2019y-10m-29d-17h-57m-39s UTC integration manifest

And between the following Mesa commits:
* 2019-10-28 17:47:06: d298740a1c: iris: Disallow incomplete resource creation
* 2019-10-29 16:19:34: ff6e148a3d: freedreno/a6xx: add a618 support


Note:
* I do not see this issue when using a few-months-old Mesa with the latest
drm-tip kernel, so some change in Mesa triggers this kernel issue.
* If the latest Mesa is used with the drm-tip 5.3 kernel, X fails to start
4 times out of 5. This began with a Mesa version from within a couple of days
of the GPU hang recovery issue, so there may be more issues in Mesa's AMD
(HadesCanyon) support.

