On Mon, Jan 16, 2023 at 4:21 PM Guilherme G. Piccoli <gpiccoli@xxxxxxxxxx> wrote:
>
> Currently the FW loading path perform some checks based on IP model
> and in case it is advertised as supported, the VCN indirect SRAM
> mode is used.
>
> Happens that in case there's any issue on FW and this mode ends-up
> not being properly supported, the driver probe fails [0]. Debugging
> this requires driver rebuilding, so to allow fast debug and experiments,
> add a parameter to force setting indirect SRAM mode to true/false from
> the kernel command-line; parameter default is -1, which doesn't change
> the current driver's behavior.

Probably a question for Leo or James, but I was under the impression
non-indirect mode didn't work on production silicon for most chips,
but maybe I'm mis-remembering that.

Alex

>
> [0] Example of this issue, observed on Steam Deck:
>
> [drm] kiq ring mec 2 pipe 1 q 0
> [drm] failed to load ucode VCN0_RAM(0x3A)
> [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0000)
> amdgpu 0000:04:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring vcn_dec_0 test failed (-110)
> [drm:amdgpu_device_init.cold [amdgpu]] *ERROR* hw_init of IP block <vcn_v3_0> failed -110
> amdgpu 0000:04:00.0: amdgpu: amdgpu_device_ip_init failed
> amdgpu 0000:04:00.0: amdgpu: Fatal error during GPU init
>
> Cc: James Zhu <James.Zhu@xxxxxxx>
> Cc: Lazar Lijo <Lijo.Lazar@xxxxxxx>
> Cc: Leo Liu <leo.liu@xxxxxxx>
> Cc: Mario Limonciello <mario.limonciello@xxxxxxx>
> Cc: Sonny Jiang <sonny.jiang@xxxxxxx>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@xxxxxxxxxx>
> ---
>
>
> This work is based on agd5f/amd-staging-drm-next branch.
> Thanks in advance for reviews/comments!
> Cheers,
>
> Guilherme
>
>
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h     | 1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 9 +++++++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 3 +++
>  3 files changed, 13 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 872450a3a164..5d3c92c94f18 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -215,6 +215,7 @@ extern int amdgpu_noretry;
>  extern int amdgpu_force_asic_type;
>  extern int amdgpu_smartshift_bias;
>  extern int amdgpu_use_xgmi_p2p;
> +extern int amdgpu_indirect_sram;
>  #ifdef CONFIG_HSA_AMD
>  extern int sched_policy;
>  extern bool debug_evictions;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 06aba201d4db..c7182c0bc841 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -187,6 +187,7 @@ int amdgpu_num_kcq = -1;
>  int amdgpu_smartshift_bias;
>  int amdgpu_use_xgmi_p2p = 1;
>  int amdgpu_vcnfw_log;
> +int amdgpu_indirect_sram = -1;
>
>  static void amdgpu_drv_delayed_reset_work_handler(struct work_struct *work);
>
> @@ -941,6 +942,14 @@ MODULE_PARM_DESC(smu_pptable_id,
>  	"specify pptable id to be used (-1 = auto(default) value, 0 = use pptable from vbios, > 0 = soft pptable id)");
>  module_param_named(smu_pptable_id, amdgpu_smu_pptable_id, int, 0444);
>
> +/**
> + * DOC: indirect_sram (int)
> + * Allow users to force using (or not) the VCN indirect SRAM mode in the fw load
> + * code. Default is -1, meaning auto (aka, don't mess with driver's behavior).
> + */
> +MODULE_PARM_DESC(indirect_sram, "Force VCN indirect SRAM (-1 = auto (default), 0 = disabled, 1 = enabled)");
> +module_param_named(indirect_sram, amdgpu_indirect_sram, int, 0444);
> +
>  /* These devices are not supported by amdgpu.
>   * They are supported by the mach64, r128, radeon drivers
>   */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> index 1f880e162d9d..a2290087e01c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> @@ -137,6 +137,9 @@ int amdgpu_vcn_sw_init(struct amdgpu_device *adev)
>  		return -EINVAL;
>  	}
>
> +	if (amdgpu_indirect_sram >= 0)
> +		adev->vcn.indirect_sram = (bool)amdgpu_indirect_sram;
> +
>  	hdr = (const struct common_firmware_header *)adev->vcn.fw->data;
>  	adev->vcn.fw_version = le32_to_cpu(hdr->ucode_version);
>
> --
> 2.39.0
>