[Public] > -----Original Message----- > From: Guilherme G. Piccoli <gpiccoli@xxxxxxxxxx> > Sent: Tuesday, April 18, 2023 6:15 PM > To: stable@xxxxxxxxxxxxxxx > Cc: gregkh@xxxxxxxxxxxxxxxxxxx; sashal@xxxxxxxxxx; amd- > gfx@xxxxxxxxxxxxxxxxxxxxx; Deucher, Alexander > <Alexander.Deucher@xxxxxxx>; Zhu, James <James.Zhu@xxxxxxx>; Liu, > Leo <Leo.Liu@xxxxxxx>; kernel@xxxxxxxxxxxx; kernel-dev@xxxxxxxxxx > Subject: [PATCH 6.1.y] drm/amdgpu/vcn: Disable indirect SRAM on Vangogh > broken BIOSes > > commit 542a56e8eb4467ae654eefab31ff194569db39cd upstream. > > The VCN firmware loading path enables the indirect SRAM mode if it's > advertised as supported. We might have some cases of FW issues that > prevents this mode to working properly though, ending-up in a failed probe. > An example below, observed in the Steam Deck: > > [...] > [drm] failed to load ucode VCN0_RAM(0x3A) [drm] psp gfx command > LOAD_IP_FW(0x6) failed and response status is (0xFFFF0000) amdgpu > 0000:04:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring > vcn_dec_0 test failed (-110) [drm:amdgpu_device_init.cold [amdgpu]] > *ERROR* hw_init of IP block <vcn_v3_0> failed -110 amdgpu 0000:04:00.0: > amdgpu: amdgpu_device_ip_init failed amdgpu 0000:04:00.0: amdgpu: Fatal > error during GPU init [...] > > Disabling the VCN block circumvents this, but it's a very invasive workaround > that turns off the entire feature. So, let's add a quirk on VCN loading that > checks for known problematic BIOSes on Vangogh, so we can proactively > disable the indirect SRAM mode and allow the HW proper probe and VCN IP > block to work fine. > > Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2385 > Fixes: 82132ecc5432 ("drm/amdgpu: enable Vangogh VCN indirect sram > mode") > Fixes: 9a8cc8cabc1e ("drm/amdgpu: enable Vangogh VCN indirect sram > mode") > Cc: stable@xxxxxxxxxxxxxxx > Cc: James Zhu <James.Zhu@xxxxxxx> > Cc: Leo Liu <leo.liu@xxxxxxx> > Signed-off-by: Guilherme G. Piccoli <gpiccoli@xxxxxxxxxx> > Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx> > --- > > Hi folks, this was build/boot tested on Deck. I've also adjusted the context, > function was reworked on 6.2. > > But what a surprise was for me not see this fix already in 6.1.y, since I've > CCed stable, and the reason for that is really peculiar: > > > $ git log -1 --pretty="%an <%ae>: %s" 82132ecc5432 Leo Liu > <leo.liu@xxxxxxx>: drm/amdgpu: enable Vangogh VCN indirect sram mode > > $ git describe --contains 82132ecc5432 > v6.2-rc1~124^2~1^2~13 > > > $ git log -1 --pretty="%an <%ae>: %s" 9a8cc8cabc1e Leo Liu > <leo.liu@xxxxxxx>: drm/amdgpu: enable Vangogh VCN indirect sram mode > > $ git describe --contains 9a8cc8cabc1e > v6.1-rc8~16^2^2 > > > This is quite strange for me, we have 2 commit hashes pointing to the > *same* commit, and each one is present..in a different release !!?! > Since I've marked this patch as fixing 82132ecc5432 originally, 6.1.y stable > misses it, since it only contains 9a8cc8cabc1e (which is the same patch!). > > Alex, do you have an idea why sometimes commits from the AMD tree > appear duplicate in mainline? Specially in different releases, this could cause > some confusion I guess. This is pretty common in the drm. The problem is there is not a good way to get bug fixes into both -next and -fixes without constant back merges. So the patches end up in -next and if they are important for stable, they go to -fixes too. We don't want -next to be broken, but we can't wait until the next kernel cycle to get the bug fixes into the current kernel and stable. I don't know of a good way to make this smoother. Alex > > Thanks in advance, > > > Guilherme > > > drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 17 +++++++++++++++++ > 1 file changed, 17 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c > index ce64ca1c6e66..5c1193dd7d88 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c > @@ -26,6 +26,7 @@ > > #include <linux/firmware.h> > #include <linux/module.h> > +#include <linux/dmi.h> > #include <linux/pci.h> > #include <linux/debugfs.h> > #include <drm/drm_drv.h> > @@ -84,6 +85,7 @@ int amdgpu_vcn_sw_init(struct amdgpu_device *adev) > { > unsigned long bo_size; > const char *fw_name; > + const char *bios_ver; > const struct common_firmware_header *hdr; > unsigned char fw_check; > unsigned int fw_shared_size, log_offset; @@ -159,6 +161,21 @@ int > amdgpu_vcn_sw_init(struct amdgpu_device *adev) > if ((adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) > && > (adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG)) > adev->vcn.indirect_sram = true; > + /* > + * Some Steam Deck's BIOS versions are incompatible with > the > + * indirect SRAM mode, leading to amdgpu being unable to > get > + * properly probed (and even potentially crashing the > kernel). > + * Hence, check for these versions here - notice this is > + * restricted to Vangogh (Deck's APU). > + */ > + bios_ver = dmi_get_system_info(DMI_BIOS_VERSION); > + > + if (bios_ver && (!strncmp("F7A0113", bios_ver, 7) || > + !strncmp("F7A0114", bios_ver, 7))) { > + adev->vcn.indirect_sram = false; > + dev_info(adev->dev, > + "Steam Deck quirk: indirect SRAM disabled on > BIOS %s\n", bios_ver); > + } > break; > case IP_VERSION(3, 0, 16): > fw_name = FIRMWARE_DIMGREY_CAVEFISH; > -- > 2.40.0