[Bug 204241] amdgpu fails to resume from suspend

https://bugzilla.kernel.org/show_bug.cgi?id=204241

Ahzo@xxxxxxxxxxxx changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #285349|0                           |1
        is obsolete|                            |

--- Comment #20 from Ahzo@xxxxxxxxxxxx ---
Created attachment 285469
  --> https://bugzilla.kernel.org/attachment.cgi?id=285469&action=edit
Patch to fix the resume failures

(In reply to Alex Deucher from comment #17)
> I'm not sure I understand why the patch helps.  You are just changing the
> order of two memory allocations.  The order shouldn't matter.

My hypothesis is that the order here is not the root cause of the problem, but
rather affects how likely the problem is to manifest itself.
This is based on the fact that I once saw a resume failure typical of this
bug on linux 5.0, although I am unable to reproduce it with that version.

As commit 533aed278afe apparently makes the failures much more likely to
happen, it provides an opportunity to debug this further by backporting it to
older linux versions.
With the backport applied, the resume failures show up on versions down to
linux 4.15, but not on linux 4.14.

A bisection between these two, while backporting 533aed278afe on every step,
led to commit 2a91f272e34c, which failed to boot and thus had to be skipped,
and:
commit e0128efb08b3d628d767ec8578e77cdd7ecc8f81
Author: James Zhu <James.Zhu@xxxxxxx>
Date:   Fri Sep 29 16:42:27 2017 -0400

    drm/amdgpu: add uvd enc ib test

    Generate create/destroy messages to test UVD encode indirect buffer
    function.
    And enable UVD encode IB test during device initialization.

    Signed-off-by: James Zhu <James.Zhu@xxxxxxx>
    Reviewed-and-Tested-by: Leo Liu <leo.liu@xxxxxxx>
    Reviewed-by: Christian König <christian.koenig@xxxxxxx>
    Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>

This looks like a likely root cause. Indeed, adding 'return 0;' at the
beginning of uvd_v6_0_enc_ring_test_ib makes the problem unreproducible, even
on the latest linux 5.4-rc2.

Comparing with amdgpu_uvd_get_{create,destroy}_msg shows that these use 0 as a
dummy GPU pointer, while uvd_v6_0_enc_get_{create,destroy}_msg use a real GPU
memory address.
Changing them to also use 0 as the dummy pointer, as is done in the attached
patch, actually fixes the resume failures.
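
To make the difference concrete, here is a minimal standalone sketch of the two
variants. It is not the kernel code: struct toy_ib, toy_emit_addr and the
1024-byte offset are hypothetical stand-ins chosen for illustration, and the
real message layout is not reproduced. It only shows how a 64-bit address ends
up as two dwords in an indirect buffer, once derived from a real GPU address and
once as a zero dummy.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_IB_DWORDS 16

/* Hypothetical stand-in for an indirect buffer; not struct amdgpu_ib. */
struct toy_ib {
    uint64_t gpu_addr;              /* where the buffer lives in GPU memory */
    uint32_t ptr[TOY_IB_DWORDS];    /* CPU-visible contents of the buffer */
    uint32_t length_dw;             /* number of dwords written so far */
};

/* Append a 64-bit address to the buffer as two dwords: high half, then low. */
static void toy_emit_addr(struct toy_ib *ib, uint64_t addr)
{
    assert(ib->length_dw + 2 <= TOY_IB_DWORDS);
    ib->ptr[ib->length_dw++] = (uint32_t)(addr >> 32);
    ib->ptr[ib->length_dw++] = (uint32_t)(addr & 0xffffffffu);
}

/* Variant A: the message points at a real address inside the buffer itself. */
static void toy_msg_with_real_addr(struct toy_ib *ib)
{
    toy_emit_addr(ib, ib->gpu_addr + 1024);  /* offset is an assumption */
}

/* Variant B: the message carries 0 as a dummy pointer. */
static void toy_msg_with_dummy_addr(struct toy_ib *ib)
{
    toy_emit_addr(ib, 0);
}
```

With variant A the engine is handed an address it could actually access, so
what happens depends on what lies at that location; with variant B the message
carries no usable address at all.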

Maybe a similar change should also be made for UVD 7.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel



