Secure part requires PSP load VCN boot sequence which is with indirect sram mode.
Regards,
Leo
From: Alex Deucher <alexdeucher@xxxxxxxxx>
Sent: January 16, 2023 4:50 PM
To: Guilherme G. Piccoli <gpiccoli@xxxxxxxxxx>
Cc: amd-gfx@xxxxxxxxxxxxxxxxxxxxx <amd-gfx@xxxxxxxxxxxxxxxxxxxxx>; Jiang, Sonny <Sonny.Jiang@xxxxxxx>; kernel@xxxxxxxxxxxx <kernel@xxxxxxxxxxxx>; Pan, Xinhui <Xinhui.Pan@xxxxxxx>; dri-devel@xxxxxxxxxxxxxxxxxxxxx <dri-devel@xxxxxxxxxxxxxxxxxxxxx>; Lazar, Lijo <Lijo.Lazar@xxxxxxx>; Limonciello, Mario <Mario.Limonciello@xxxxxxx>; kernel-dev@xxxxxxxxxx <kernel-dev@xxxxxxxxxx>; Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Zhu, James <James.Zhu@xxxxxxx>; Liu, Leo <Leo.Liu@xxxxxxx>; Koenig, Christian <Christian.Koenig@xxxxxxx>
Subject: Re: [PATCH 3/3] drm/amdgpu/vcn: Add parameter to force (en/dis)abling indirect SRAM mode
Sent: January 16, 2023 4:50 PM
To: Guilherme G. Piccoli <gpiccoli@xxxxxxxxxx>
Cc: amd-gfx@xxxxxxxxxxxxxxxxxxxxx <amd-gfx@xxxxxxxxxxxxxxxxxxxxx>; Jiang, Sonny <Sonny.Jiang@xxxxxxx>; kernel@xxxxxxxxxxxx <kernel@xxxxxxxxxxxx>; Pan, Xinhui <Xinhui.Pan@xxxxxxx>; dri-devel@xxxxxxxxxxxxxxxxxxxxx <dri-devel@xxxxxxxxxxxxxxxxxxxxx>; Lazar, Lijo <Lijo.Lazar@xxxxxxx>; Limonciello, Mario <Mario.Limonciello@xxxxxxx>; kernel-dev@xxxxxxxxxx <kernel-dev@xxxxxxxxxx>; Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Zhu, James <James.Zhu@xxxxxxx>; Liu, Leo <Leo.Liu@xxxxxxx>; Koenig, Christian <Christian.Koenig@xxxxxxx>
Subject: Re: [PATCH 3/3] drm/amdgpu/vcn: Add parameter to force (en/dis)abling indirect SRAM mode
On Mon, Jan 16, 2023 at 4:21 PM Guilherme G. Piccoli
<gpiccoli@xxxxxxxxxx> wrote:
>
> Currently the FW loading path perform some checks based on IP model
> and in case it is advertised as supported, the VCN indirect SRAM
> mode is used.
>
> Happens that in case there's any issue on FW and this mode ends-up
> not being properly supported, the driver probe fails [0]. Debugging
> this requires driver rebuilding, so to allow fast debug and experiments,
> add a parameter to force setting indirect SRAM mode to true/false from
> the kernel command-line; parameter default is -1, which doesn't change
> the current driver's behavior.
Probably a question for Leo or James, but I was under the impression
non-indirect mode didn't work on production silicon for most chips,
but maybe I'm mis-remembering that.
Alex
>
> [0] Example of this issue, observed on Steam Deck:
>
> [drm] kiq ring mec 2 pipe 1 q 0
> [drm] failed to load ucode VCN0_RAM(0x3A)
> [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0000)
> amdgpu 0000:04:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring vcn_dec_0 test failed (-110)
> [drm:amdgpu_device_init.cold [amdgpu]] *ERROR* hw_init of IP block <vcn_v3_0> failed -110
> amdgpu 0000:04:00.0: amdgpu: amdgpu_device_ip_init failed
> amdgpu 0000:04:00.0: amdgpu: Fatal error during GPU init
>
> Cc: James Zhu <James.Zhu@xxxxxxx>
> Cc: Lazar Lijo <Lijo.Lazar@xxxxxxx>
> Cc: Leo Liu <leo.liu@xxxxxxx>
> Cc: Mario Limonciello <mario.limonciello@xxxxxxx>
> Cc: Sonny Jiang <sonny.jiang@xxxxxxx>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@xxxxxxxxxx>
> ---
>
>
> This work is based on agd5f/amd-staging-drm-next branch.
> Thanks in advance for reviews/comments!
> Cheers,
>
> Guilherme
>
>
> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 9 +++++++++
> drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 3 +++
> 3 files changed, 13 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 872450a3a164..5d3c92c94f18 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -215,6 +215,7 @@ extern int amdgpu_noretry;
> extern int amdgpu_force_asic_type;
> extern int amdgpu_smartshift_bias;
> extern int amdgpu_use_xgmi_p2p;
> +extern int amdgpu_indirect_sram;
> #ifdef CONFIG_HSA_AMD
> extern int sched_policy;
> extern bool debug_evictions;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 06aba201d4db..c7182c0bc841 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -187,6 +187,7 @@ int amdgpu_num_kcq = -1;
> int amdgpu_smartshift_bias;
> int amdgpu_use_xgmi_p2p = 1;
> int amdgpu_vcnfw_log;
> +int amdgpu_indirect_sram = -1;
>
> static void amdgpu_drv_delayed_reset_work_handler(struct work_struct *work);
>
> @@ -941,6 +942,14 @@ MODULE_PARM_DESC(smu_pptable_id,
> "specify pptable id to be used (-1 = auto(default) value, 0 = use pptable from vbios, > 0 = soft pptable id)");
> module_param_named(smu_pptable_id, amdgpu_smu_pptable_id, int, 0444);
>
> +/**
> + * DOC: indirect_sram (int)
> + * Allow users to force using (or not) the VCN indirect SRAM mode in the fw load
> + * code. Default is -1, meaning auto (aka, don't mess with driver's behavior).
> + */
> +MODULE_PARM_DESC(indirect_sram, "Force VCN indirect SRAM (-1 = auto (default), 0 = disabled, 1 = enabled)");
> +module_param_named(indirect_sram, amdgpu_indirect_sram, int, 0444);
> +
> /* These devices are not supported by amdgpu.
> * They are supported by the mach64, r128, radeon drivers
> */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> index 1f880e162d9d..a2290087e01c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> @@ -137,6 +137,9 @@ int amdgpu_vcn_sw_init(struct amdgpu_device *adev)
> return -EINVAL;
> }
>
> + if (amdgpu_indirect_sram >= 0)
> + adev->vcn.indirect_sram = (bool)amdgpu_indirect_sram;
> +
> hdr = (const struct common_firmware_header *)adev->vcn.fw->data;
> adev->vcn.fw_version = le32_to_cpu(hdr->ucode_version);
>
> --
> 2.39.0
>
<gpiccoli@xxxxxxxxxx> wrote:
>
> Currently the FW loading path perform some checks based on IP model
> and in case it is advertised as supported, the VCN indirect SRAM
> mode is used.
>
> Happens that in case there's any issue on FW and this mode ends-up
> not being properly supported, the driver probe fails [0]. Debugging
> this requires driver rebuilding, so to allow fast debug and experiments,
> add a parameter to force setting indirect SRAM mode to true/false from
> the kernel command-line; parameter default is -1, which doesn't change
> the current driver's behavior.
Probably a question for Leo or James, but I was under the impression
non-indirect mode didn't work on production silicon for most chips,
but maybe I'm mis-remembering that.
Alex
>
> [0] Example of this issue, observed on Steam Deck:
>
> [drm] kiq ring mec 2 pipe 1 q 0
> [drm] failed to load ucode VCN0_RAM(0x3A)
> [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0000)
> amdgpu 0000:04:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring vcn_dec_0 test failed (-110)
> [drm:amdgpu_device_init.cold [amdgpu]] *ERROR* hw_init of IP block <vcn_v3_0> failed -110
> amdgpu 0000:04:00.0: amdgpu: amdgpu_device_ip_init failed
> amdgpu 0000:04:00.0: amdgpu: Fatal error during GPU init
>
> Cc: James Zhu <James.Zhu@xxxxxxx>
> Cc: Lazar Lijo <Lijo.Lazar@xxxxxxx>
> Cc: Leo Liu <leo.liu@xxxxxxx>
> Cc: Mario Limonciello <mario.limonciello@xxxxxxx>
> Cc: Sonny Jiang <sonny.jiang@xxxxxxx>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@xxxxxxxxxx>
> ---
>
>
> This work is based on agd5f/amd-staging-drm-next branch.
> Thanks in advance for reviews/comments!
> Cheers,
>
> Guilherme
>
>
> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 9 +++++++++
> drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 3 +++
> 3 files changed, 13 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 872450a3a164..5d3c92c94f18 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -215,6 +215,7 @@ extern int amdgpu_noretry;
> extern int amdgpu_force_asic_type;
> extern int amdgpu_smartshift_bias;
> extern int amdgpu_use_xgmi_p2p;
> +extern int amdgpu_indirect_sram;
> #ifdef CONFIG_HSA_AMD
> extern int sched_policy;
> extern bool debug_evictions;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 06aba201d4db..c7182c0bc841 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -187,6 +187,7 @@ int amdgpu_num_kcq = -1;
> int amdgpu_smartshift_bias;
> int amdgpu_use_xgmi_p2p = 1;
> int amdgpu_vcnfw_log;
> +int amdgpu_indirect_sram = -1;
>
> static void amdgpu_drv_delayed_reset_work_handler(struct work_struct *work);
>
> @@ -941,6 +942,14 @@ MODULE_PARM_DESC(smu_pptable_id,
> "specify pptable id to be used (-1 = auto(default) value, 0 = use pptable from vbios, > 0 = soft pptable id)");
> module_param_named(smu_pptable_id, amdgpu_smu_pptable_id, int, 0444);
>
> +/**
> + * DOC: indirect_sram (int)
> + * Allow users to force using (or not) the VCN indirect SRAM mode in the fw load
> + * code. Default is -1, meaning auto (aka, don't mess with driver's behavior).
> + */
> +MODULE_PARM_DESC(indirect_sram, "Force VCN indirect SRAM (-1 = auto (default), 0 = disabled, 1 = enabled)");
> +module_param_named(indirect_sram, amdgpu_indirect_sram, int, 0444);
> +
> /* These devices are not supported by amdgpu.
> * They are supported by the mach64, r128, radeon drivers
> */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> index 1f880e162d9d..a2290087e01c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> @@ -137,6 +137,9 @@ int amdgpu_vcn_sw_init(struct amdgpu_device *adev)
> return -EINVAL;
> }
>
> + if (amdgpu_indirect_sram >= 0)
> + adev->vcn.indirect_sram = (bool)amdgpu_indirect_sram;
> +
> hdr = (const struct common_firmware_header *)adev->vcn.fw->data;
> adev->vcn.fw_version = le32_to_cpu(hdr->ucode_version);
>
> --
> 2.39.0
>