Hi Michel, Christian,

Michel, I have tested amd-staging-drm-next at commit
"drm/amdgpu/gfx9: only init the apertures used by KGD (v2)" -
0e4946409d11913523d30bc4830d10b388438c7a and the issues remain, both on
ARMv7 and on x86 amd64.

Christian, in fact if I replay the apitraces obtained on the ARMv7
platform on the AMD64 I am also able to reproduce the GPU hang! So it is
not ARM platform specific. Should I send/upload the apitraces? I have two
of them, typically when one doesn't hang the gpu the other hangs. One
takes about 1GB of disk space while the other takes 2.3GB.

...
[ 69.019381] ISO 9660 Extensions: RRIP_1991A
[ 213.292094] DMAR: DRHD: handling fault status reg 2
[ 213.292102] DMAR: [INTR-REMAP] Request device [00:00.0] fault index 1c [fault reason 38] Blocked an interrupt request due to source-id verification failure
[ 223.406919] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled seq=25158, last emitted seq=25160
[ 223.406926] [drm] IP block:tonga_ih is hung!
[ 223.407167] [drm] GPU recovery disabled.

Regards,
Luís

On Wed, Jan 3, 2018 at 5:47 PM, Luís Mendes <luis.p.mendes at gmail.com> wrote:
> Hi Michel, Christian,
>
> Christian, I have followed your suggestion and I have just submitted a
> bug to fdo at https://bugs.freedesktop.org/show_bug.cgi?id=104481 -
> GPU lockup Polaris 11 - AMD RX 460 and RX 550 on amd64 and on ARMv7
> platforms while playing video.
>
> Michel, amdgpu.dc=0 seems to make no difference. I will try
> amd-staging-drm-next and report back.
>
> Regards,
> Luís
>
> On Wed, Jan 3, 2018 at 5:09 PM, Michel Dänzer <michel at daenzer.net> wrote:
>> On 2018-01-03 12:02 PM, Luís Mendes wrote:
>>>
>>> What I believe it seems to be the case is that the GPU lock up only
>>> happens when doing a page flip, since the kernel locks with:
>>> [ 243.693200] kworker/u4:3 D 0 89 2 0x00000000
>>> [ 243.693232] Workqueue: events_unbound commit_work [drm_kms_helper]
>>> [ 243.693251] [<80b8c6d4>] (__schedule) from [<80b8cdd0>] (schedule+0x4c/0xac)
>>> [ 243.693259] [<80b8cdd0>] (schedule) from [<80b91024>]
>>> (schedule_timeout+0x228/0x444)
>>> [ 243.693270] [<80b91024>] (schedule_timeout) from [<80886738>]
>>> (dma_fence_default_wait+0x2b4/0x2d8)
>>> [ 243.693276] [<80886738>] (dma_fence_default_wait) from [<80885d60>]
>>> (dma_fence_wait_timeout+0x40/0x150)
>>> [ 243.693284] [<80885d60>] (dma_fence_wait_timeout) from [<80887b1c>]
>>> (reservation_object_wait_timeout_rcu+0xfc/0x34c)
>>> [ 243.693509] [<80887b1c>] (reservation_object_wait_timeout_rcu) from
>>> [<7f331988>] (amdgpu_dm_do_flip+0xec/0x36c [amdgpu])
>>> [ 243.693789] [<7f331988>] (amdgpu_dm_do_flip [amdgpu]) from
>>> [<7f33309c>] (amdgpu_dm_atomic_commit_tail+0xbfc/0xe58 [amdgpu])
>>> ...
>>
>> Does the problem also occur if you disable DC with amdgpu.dc=0 on the
>> kernel command line?
>>
>> Does it also happen with a kernel built from the amd-staging-drm-next
>> branch instead of drm-next-4.16?
>>
>>
>> --
>> Earthling Michel Dänzer | http://www.amd.com
>> Libre software enthusiast | Mesa and X developer