Currently the FW loading path performs some checks based on the IP model,
and in case it is advertised as supported, the VCN indirect SRAM mode is
used.

However, if there is any issue with the FW and this mode ends up not
being properly supported, the driver probe fails [0]. Debugging this
requires rebuilding the driver, so to allow fast debugging and
experiments, add a parameter to force the indirect SRAM mode to
true/false from the kernel command line. The parameter defaults to -1,
which doesn't change the current driver's behavior.

[0] Example of this issue, observed on Steam Deck:

[drm] kiq ring mec 2 pipe 1 q 0
[drm] failed to load ucode VCN0_RAM(0x3A)
[drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0000)
amdgpu 0000:04:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring vcn_dec_0 test failed (-110)
[drm:amdgpu_device_init.cold [amdgpu]] *ERROR* hw_init of IP block <vcn_v3_0> failed -110
amdgpu 0000:04:00.0: amdgpu: amdgpu_device_ip_init failed
amdgpu 0000:04:00.0: amdgpu: Fatal error during GPU init

Cc: James Zhu <James.Zhu@xxxxxxx>
Cc: Lazar Lijo <Lijo.Lazar@xxxxxxx>
Cc: Leo Liu <leo.liu@xxxxxxx>
Cc: Mario Limonciello <mario.limonciello@xxxxxxx>
Cc: Sonny Jiang <sonny.jiang@xxxxxxx>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@xxxxxxxxxx>
---

This work is based on the agd5f/amd-staging-drm-next branch.
Thanks in advance for reviews/comments!
Cheers,

Guilherme

 drivers/gpu/drm/amd/amdgpu/amdgpu.h     | 1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 9 +++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 3 +++
 3 files changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 872450a3a164..5d3c92c94f18 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -215,6 +215,7 @@ extern int amdgpu_noretry;
 extern int amdgpu_force_asic_type;
 extern int amdgpu_smartshift_bias;
 extern int amdgpu_use_xgmi_p2p;
+extern int amdgpu_indirect_sram;
 #ifdef CONFIG_HSA_AMD
 extern int sched_policy;
 extern bool debug_evictions;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 06aba201d4db..c7182c0bc841 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -187,6 +187,7 @@ int amdgpu_num_kcq = -1;
 int amdgpu_smartshift_bias;
 int amdgpu_use_xgmi_p2p = 1;
 int amdgpu_vcnfw_log;
+int amdgpu_indirect_sram = -1;
 
 static void amdgpu_drv_delayed_reset_work_handler(struct work_struct *work);
 
@@ -941,6 +942,14 @@ MODULE_PARM_DESC(smu_pptable_id,
 	"specify pptable id to be used (-1 = auto(default) value, 0 = use pptable from vbios, > 0 = soft pptable id)");
 module_param_named(smu_pptable_id, amdgpu_smu_pptable_id, int, 0444);
 
+/**
+ * DOC: indirect_sram (int)
+ * Allow users to force using (or not) the VCN indirect SRAM mode in the fw load
+ * code. Default is -1, meaning auto (aka, don't mess with driver's behavior).
+ */
+MODULE_PARM_DESC(indirect_sram, "Force VCN indirect SRAM (-1 = auto (default), 0 = disabled, 1 = enabled)");
+module_param_named(indirect_sram, amdgpu_indirect_sram, int, 0444);
+
 /* These devices are not supported by amdgpu.
  * They are supported by the mach64, r128, radeon drivers
  */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index 1f880e162d9d..a2290087e01c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -137,6 +137,9 @@ int amdgpu_vcn_sw_init(struct amdgpu_device *adev)
 		return -EINVAL;
 	}
 
+	if (amdgpu_indirect_sram >= 0)
+		adev->vcn.indirect_sram = (bool)amdgpu_indirect_sram;
+
 	hdr = (const struct common_firmware_header *)adev->vcn.fw->data;
 	adev->vcn.fw_version = le32_to_cpu(hdr->ucode_version);
 
-- 
2.39.0
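
For reference, the tri-state override semantics added in amdgpu_vcn_sw_init() can be sketched as a small user-space function. This is only an illustration under stated assumptions: resolve_indirect_sram() and its arguments are hypothetical names that do not exist in the driver, and driver_default stands in for whatever the IP-model checks have already decided.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not part of amdgpu): mirrors the override logic
 * from the patch. A negative parameter means "auto", so the driver's own
 * decision is kept; 0 forces the mode off; any other non-negative value
 * forces it on, because the cast to bool maps every non-zero value to true.
 */
bool resolve_indirect_sram(bool driver_default, int param)
{
	if (param >= 0)
		return (bool)param;

	return driver_default;
}
```

With the patch applied, the parameter would typically be set at boot as amdgpu.indirect_sram=0 (or =1) on the kernel command line. Since it is registered with mode 0444, it can be read back under /sys/module/amdgpu/parameters/ but not changed at runtime.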