kernel 5.15.x: AMD RX 6700 XT - Fails to resume after screen blank

Hi all,

TL;DR: git bisection points to https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.15.4&id=61d861cf478576d85d6032f864360a34b26084b1 as the cause of a failure when the GPU changes power state after idle.

Since 5.15.0 my GPU has intermittently failed to resume after entering power saving. The kernel log shows errors like these:

Nov 18 09:52:19 katana kernel: [ 4921.669813] [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
Nov 18 09:52:21 katana kernel: [ 4923.667803] snd_hda_intel 0000:0d:00.1: refused to change power state from D0 to D3hot
Nov 18 09:52:26 katana kernel: [ 4928.622234] amdgpu 0000:0d:00.0: amdgpu: Failed to export SMU metrics table!
Nov 18 09:52:31 katana kernel: [ 4933.371814] [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
Nov 18 09:52:31 katana kernel: [ 4933.650854] [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
Nov 18 09:52:32 katana kernel: [ 4933.921708] [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
Nov 18 09:52:32 katana kernel: [ 4933.940249] amdgpu 0000:0d:00.0: amdgpu: SMU: I'm not done with your previous command!
Nov 18 09:52:32 katana kernel: [ 4933.940254] amdgpu 0000:0d:00.0: amdgpu: Failed to export SMU metrics table!
Nov 18 09:52:32 katana kernel: [ 4934.192236] [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
Nov 18 09:52:32 katana kernel: [ 4934.463213] [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
Nov 18 09:52:33 katana kernel: [ 4934.736895] [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
Nov 18 09:52:33 katana kernel: [ 4935.007928] [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
Nov 18 09:52:33 katana kernel: [ 4935.279063] [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
Nov 18 09:52:33 katana kernel: [ 4935.550243] [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
Nov 18 09:52:34 katana kernel: [ 4935.824034] [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
Nov 18 09:52:34 katana kernel: [ 4936.095158] [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
Nov 18 09:52:34 katana kernel: [ 4936.366210] [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
Nov 18 09:52:34 katana kernel: [ 4936.629193] [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
Nov 18 09:52:35 katana kernel: [ 4936.886333] [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
Nov 18 09:52:35 katana kernel: [ 4937.140815] [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
Nov 18 09:52:35 katana kernel: [ 4937.395341] [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
Nov 18 09:52:35 katana kernel: [ 4937.649885] [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
Nov 18 09:52:36 katana kernel: [ 4937.906944] [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
Nov 18 09:52:36 katana kernel: [ 4938.162866] [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3

This eventually leads to processes crashing and the system locking up during shutdown.
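
In case anyone wants to check their own logs for the same pattern, here is a rough Python sketch that tallies the messages quoted above. It assumes a syslog-style kernel log with the "[ uptime ]" stamp, as in the excerpt; the function name and the category labels are my own invention, not anything from the driver.

```python
import re
from collections import Counter

# Matches the "kernel: [ 4921.669813] message" part of a syslog-style line.
KERNEL_LINE = re.compile(r"kernel: \[\s*(\d+\.\d+)\] (.*)$")

# Substrings copied from the log excerpt above; the labels are hypothetical.
PATTERNS = {
    "dmub_idle_timeout": "Error waiting for DMUB idle",
    "smu_busy": "SMU: I'm not done with your previous command!",
    "smu_metrics_failed": "Failed to export SMU metrics table!",
    "d3hot_refused": "refused to change power state from D0 to D3hot",
}


def summarize(lines):
    """Count each known message and record the uptime of its first occurrence."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        match = KERNEL_LINE.search(line)
        if not match:
            continue  # not a kernel line in the expected format
        uptime = float(match.group(1))
        message = match.group(2)
        for label, needle in PATTERNS.items():
            if needle in message:
                counts[label] += 1
                first_seen.setdefault(label, uptime)
    return counts, first_seen
```

Calling summarize() on an open log file (any iterable of lines works) returns the per-message counts and the uptime at which each message first appeared, which makes it easier to see whether the DMUB errors or the SMU errors come first.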

A git bisection has isolated the following patch as the cause.

commit 8f0284f190e6a0aa09015090568c03f18288231a (refs/bisect/bad)
Merge: 5bea1c8ce673 61d861cf4785
Author: Dave Airlie <airlied@xxxxxxxxxx>
Date:   Mon Aug 30 09:06:01 2021 +1000

    Merge tag 'amd-drm-next-5.15-2021-08-27' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

    amd-drm-next-5.15-2021-08-27:

    amdgpu:
    - PLL fix for SI
    - Misc code cleanups
    - RAS fixes
    - PSP cleanups
    - Polaris UVD/VCE suspend fixes
    - aldebaran fixes
    - DCN3.x mclk fixes

    amdkfd:
    - CWSR fixes for arcturus and aldebaran
    - SVM fixes

    Signed-off-by: Dave Airlie <airlied@xxxxxxxxxx>
    From: Alex Deucher <alexander.deucher@xxxxxxx>
    Link: https://patchwork.freedesktop.org/patch/msgid/20210827192336.4649-1-alexander.deucher@xxxxxxx

commit 61d861cf478576d85d6032f864360a34b26084b1 (HEAD)
Author: Nicholas Kazlauskas <nicholas.kazlauskas@xxxxxxx>
Date:   Wed May 13 11:58:50 2020 -0400

    drm/amd/display: Move AllowDRAMSelfRefreshOrDRAMClockChangeInVblank to bounding box

    [Why]
    This is a global parameter, not a per pipe parameter and it's useful
    for experimenting with the prefetch schedule to be adjustable from
    the SOC bb.

    [How]
    Add a parameter to the SOC bb, default is the existing policy for
    all DCN. Fill it in when filling SOC bb parameters.

    Revert the policy to use MinDCFClk at the same time since that's not
    going to give us P-State in most cases on the spreadsheet.

    Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1403
    Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@xxxxxxx>
    Signed-off-by: Aurabindo Pillai <aurabindo.pillai@xxxxxxx>
    Tested-by: Daniel Wheeler <Daniel.Wheeler@xxxxxxx>
    Acked-by: Alex Deucher <alexander.deucher@xxxxxxx>
    Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>

I have been running 5.15.4 with 61d861cf478576d85d6032f864360a34b26084b1 backed out for a few hours, through multiple periods of power saving, and have not seen the problem so far.

Cheers,

Mark




