How about this? We should just move HDP remap to gmc hw_init since that is mainly what uses it anyway. Alex On Tue, Aug 30, 2022 at 2:05 PM Alex Deucher <alexdeucher@xxxxxxxxx> wrote: > > On Tue, Aug 30, 2022 at 12:06 PM Lazar, Lijo <lijo.lazar@xxxxxxx> wrote: > > > > > > > > On 8/30/2022 8:39 PM, Alex Deucher wrote: > > > On Tue, Aug 30, 2022 at 10:45 AM Lazar, Lijo <lijo.lazar@xxxxxxx> wrote: > > >> > > >> > > >> > > >> On 8/30/2022 7:18 PM, Alex Deucher wrote: > > >>> On Tue, Aug 30, 2022 at 12:05 AM Lazar, Lijo <lijo.lazar@xxxxxxx> wrote: > > >>>> > > >>>> > > >>>> > > >>>> On 8/29/2022 10:20 PM, Alex Deucher wrote: > > >>>>> On Mon, Aug 29, 2022 at 4:18 AM Lijo Lazar <lijo.lazar@xxxxxxx> wrote: > > >>>>>> > > >>>>>> HDP flush is used early in the init sequence as part of memory controller > > >>>>>> block initialization. Hence remapping of HDP registers needed for flush > > >>>>>> needs to happen earlier. > > >>>>>> > > >>>>>> This also fixes the Unsupported Request error reported through AER during > > >>>>>> driver load. The error happens as a write happens to the remap offset > > >>>>>> before real remapping is done. > > >>>>>> > > >>>>>> Link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.kernel.org%2Fshow_bug.cgi%3Fid%3D216373&data=05%7C01%7Clijo.lazar%40amd.com%7Cbe8781fe1b0c41d3bad408da8a99b3d8%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637974690005710507%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=98WWFEcwi2tzyf%2BxnYC%2FK3UcCE5mfXI00qfYGUpt2Sk%3D&reserved=0 > > >>>>>> > > >>>>>> The error was unnoticed before and got visible because of the commit > > >>>>>> referenced below. This doesn't fix anything in the commit below, rather > > >>>>>> fixes the issue in amdgpu exposed by the commit. The reference is only > > >>>>>> to associate this commit with below one so that both go together. > > >>>>>> > > >>>>>> Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") > > >>>>>> > > >>>>>> Reported-by: Tom Seewald <tseewald@xxxxxxxxx> > > >>>>>> Signed-off-by: Lijo Lazar <lijo.lazar@xxxxxxx> > > >>>>>> Cc: stable@xxxxxxxxxxxxxxx > > >>>>> > > >>>>> How about something like the attached patch rather than these two > > >>>>> patches? It's a bit bigger but seems cleaner and more defensive in my > > >>>>> opinion. > > >>>>> > > >>>> > > >>>> Whenever device goes to suspend/reset and then comes back, remap offset > > >>>> has to be set back to 0 to make sure it doesn't use the wrong offset > > >>>> when the register assumes default values again. > > >>>> > > >>>> To avoid the if-check in hdp_flush (which is more frequent), another way > > >>>> is to initialize the remap offset to default offset during early init > > >>>> and hw fini/suspend sequences. It won't be obvious (even with this > > >>>> patch) as to when remap offset vs default offset is used though. > > >>> > > >>> On resume, the common IP is resumed first so it will always be set. > > >>> The only case that is a problem is init because we init GMC out of > > >>> order. We could init common before GMC in amdgpu_device_ip_init(). I > > >>> think that should be fine, but I wasn't sure if there might be some > > >>> fallout from that on certain cards. > > >>> > > >> > > >> There are other places where an IP order is forced like in > > >> amdgpu_device_ip_reinit_early_sriov(). This also may not affect this > > >> case as remapping is not done for VF. > > >> > > >> Agree that a better way is to have the common IP to be inited first. > > > > > > How about these patches? > > > > > > > Looks good to me. BTW, is nbio 7.7 for an APU (in which case hdp flush > > is not expected to be used)? > > It would be used in some cases, e.g., GPU VM passthrough where we use > the BAR rather than the carve out. > > Alex > > > > > > Thanks, > > Lijo > > > > > Alex > > > > > > > > >> > > >> Thanks, > > >> Lijo > > >> > > >>> Alex > > >>> > > >>>> > > >>>> Thanks, > > >>>> Lijo > > >>>> > > >>>>> Alex > > >>>>> > > >>>>>> --- > > >>>>>> v2: > > >>>>>> Take care of IP resume cases (Alex Deucher) > > >>>>>> Add NULL check to nbio.funcs to cover older (GFXv8) ASICs (Felix Kuehling) > > >>>>>> Add more details in commit message and associate with AER patch (Bjorn > > >>>>>> Helgaas) > > >>>>>> > > >>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 24 ++++++++++++++++++++++ > > >>>>>> drivers/gpu/drm/amd/amdgpu/nv.c | 6 ------ > > >>>>>> drivers/gpu/drm/amd/amdgpu/soc15.c | 6 ------ > > >>>>>> drivers/gpu/drm/amd/amdgpu/soc21.c | 6 ------ > > >>>>>> 4 files changed, 24 insertions(+), 18 deletions(-) > > >>>>>> > > >>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > >>>>>> index ce7d117efdb5..e420118769a5 100644 > > >>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > >>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > >>>>>> @@ -2334,6 +2334,26 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev) > > >>>>>> return 0; > > >>>>>> } > > >>>>>> > > >>>>>> +/** > > >>>>>> + * amdgpu_device_prepare_ip - prepare IPs for hardware initialization > > >>>>>> + * > > >>>>>> + * @adev: amdgpu_device pointer > > >>>>>> + * > > >>>>>> + * Any common hardware initialization sequence that needs to be done before > > >>>>>> + * hw init of individual IPs is performed here. This is different from the > > >>>>>> + * 'common block' which initializes a set of IPs. > > >>>>>> + */ > > >>>>>> +static void amdgpu_device_prepare_ip(struct amdgpu_device *adev) > > >>>>>> +{ > > >>>>>> + /* Remap HDP registers to a hole in mmio space, for the purpose > > >>>>>> + * of exposing those registers to process space. This needs to be > > >>>>>> + * done before hw init of ip blocks to take care of HDP flush > > >>>>>> + * operations through registers during hw_init. > > >>>>>> + */ > > >>>>>> + if (adev->nbio.funcs && adev->nbio.funcs->remap_hdp_registers && > > >>>>>> + !amdgpu_sriov_vf(adev)) > > >>>>>> + adev->nbio.funcs->remap_hdp_registers(adev); > > >>>>>> +} > > >>>>>> > > >>>>>> /** > > >>>>>> * amdgpu_device_ip_init - run init for hardware IPs > > >>>>>> @@ -2376,6 +2396,8 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev) > > >>>>>> DRM_ERROR("amdgpu_vram_scratch_init failed %d\n", r); > > >>>>>> goto init_failed; > > >>>>>> } > > >>>>>> + > > >>>>>> + amdgpu_device_prepare_ip(adev); > > >>>>>> r = adev->ip_blocks[i].version->funcs->hw_init((void *)adev); > > >>>>>> if (r) { > > >>>>>> DRM_ERROR("hw_init %d failed %d\n", i, r); > > >>>>>> @@ -3058,6 +3080,7 @@ static int amdgpu_device_ip_reinit_early_sriov(struct amdgpu_device *adev) > > >>>>>> AMD_IP_BLOCK_TYPE_IH, > > >>>>>> }; > > >>>>>> > > >>>>>> + amdgpu_device_prepare_ip(adev); > > >>>>>> for (i = 0; i < adev->num_ip_blocks; i++) { > > >>>>>> int j; > > >>>>>> struct amdgpu_ip_block *block; > > >>>>>> @@ -3139,6 +3162,7 @@ static int amdgpu_device_ip_resume_phase1(struct amdgpu_device *adev) > > >>>>>> { > > >>>>>> int i, r; > > >>>>>> > > >>>>>> + amdgpu_device_prepare_ip(adev); > > >>>>>> for (i = 0; i < adev->num_ip_blocks; i++) { > > >>>>>> if (!adev->ip_blocks[i].status.valid || adev->ip_blocks[i].status.hw) > > >>>>>> continue; > > >>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c > > >>>>>> index b3fba8dea63c..3ac7fef74277 100644 > > >>>>>> --- a/drivers/gpu/drm/amd/amdgpu/nv.c > > >>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/nv.c > > >>>>>> @@ -1032,12 +1032,6 @@ static int nv_common_hw_init(void *handle) > > >>>>>> nv_program_aspm(adev); > > >>>>>> /* setup nbio registers */ > > >>>>>> adev->nbio.funcs->init_registers(adev); > > >>>>>> - /* remap HDP registers to a hole in mmio space, > > >>>>>> - * for the purpose of expose those registers > > >>>>>> - * to process space > > >>>>>> - */ > > >>>>>> - if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev)) > > >>>>>> - adev->nbio.funcs->remap_hdp_registers(adev); > > >>>>>> /* enable the doorbell aperture */ > > >>>>>> nv_enable_doorbell_aperture(adev, true); > > >>>>>> > > >>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c > > >>>>>> index fde6154f2009..a0481e37d7cf 100644 > > >>>>>> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c > > >>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c > > >>>>>> @@ -1240,12 +1240,6 @@ static int soc15_common_hw_init(void *handle) > > >>>>>> soc15_program_aspm(adev); > > >>>>>> /* setup nbio registers */ > > >>>>>> adev->nbio.funcs->init_registers(adev); > > >>>>>> - /* remap HDP registers to a hole in mmio space, > > >>>>>> - * for the purpose of expose those registers > > >>>>>> - * to process space > > >>>>>> - */ > > >>>>>> - if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev)) > > >>>>>> - adev->nbio.funcs->remap_hdp_registers(adev); > > >>>>>> > > >>>>>> /* enable the doorbell aperture */ > > >>>>>> soc15_enable_doorbell_aperture(adev, true); > > >>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c b/drivers/gpu/drm/amd/amdgpu/soc21.c > > >>>>>> index 55284b24f113..16b447055102 100644 > > >>>>>> --- a/drivers/gpu/drm/amd/amdgpu/soc21.c > > >>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/soc21.c > > >>>>>> @@ -660,12 +660,6 @@ static int soc21_common_hw_init(void *handle) > > >>>>>> soc21_program_aspm(adev); > > >>>>>> /* setup nbio registers */ > > >>>>>> adev->nbio.funcs->init_registers(adev); > > >>>>>> - /* remap HDP registers to a hole in mmio space, > > >>>>>> - * for the purpose of expose those registers > > >>>>>> - * to process space > > >>>>>> - */ > > >>>>>> - if (adev->nbio.funcs->remap_hdp_registers) > > >>>>>> - adev->nbio.funcs->remap_hdp_registers(adev); > > >>>>>> /* enable the doorbell aperture */ > > >>>>>> soc21_enable_doorbell_aperture(adev, true); > > >>>>>> > > >>>>>> -- > > >>>>>> 2.25.1 > > >>>>>>
From ba71fb1b78340a91468fab44008af24ca548f38d Mon Sep 17 00:00:00 2001 From: Alex Deucher <alexander.deucher@xxxxxxx> Date: Thu, 1 Sep 2022 15:32:17 -0400 Subject: [PATCH 1/3] drm/amdgpu: init HDP remap as part of GMC hw_init for gmc9, 10 GMC is the primary user of this so do it as part of GMC init rather than common init. This way the remap is always applied before it's used. This fixes the Unsupported Request error reported through AER during driver load. The error happens as a write happens to the remap offset before real remapping is done. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373 The error was unnoticed before and got visible because of the commit referenced below. This doesn't fix anything in the commit below, rather fixes the issue in amdgpu exposed by the commit. The reference is only to associate this commit with below one so that both go together. Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx> --- drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 7 +++++++ drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 7 +++++++ drivers/gpu/drm/amd/amdgpu/nv.c | 6 ------ drivers/gpu/drm/amd/amdgpu/soc15.c | 7 ------- 4 files changed, 14 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c index f513e2c2e964..6f16852ea7c6 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c @@ -1101,6 +1101,13 @@ static int gmc_v10_0_hw_init(void *handle) if (adev->gfxhub.funcs && adev->gfxhub.funcs->utcl2_harvest) adev->gfxhub.funcs->utcl2_harvest(adev); + /* remap HDP registers to a hole in mmio space, + * for the purpose of expose those registers + * to process space + */ + if (adev->nbio.funcs->remap_hdp_registers) + adev->nbio.funcs->remap_hdp_registers(adev); + r = gmc_v10_0_gart_enable(adev); if (r) return r; diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c index 4603653916f5..31e9e8159aba 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c @@ -1832,6 +1832,13 @@ static int gmc_v9_0_hw_init(void *handle) if (adev->mmhub.funcs->update_power_gating) adev->mmhub.funcs->update_power_gating(adev, true); + /* remap HDP registers to a hole in mmio space, + * for the purpose of expose those registers + * to process space + */ + if (adev->nbio.funcs->remap_hdp_registers) + adev->nbio.funcs->remap_hdp_registers(adev); + adev->hdp.funcs->init_registers(adev); /* After HDP is initialized, flush HDP.*/ diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c index a2339cbedd32..b97c7cc72349 100644 --- a/drivers/gpu/drm/amd/amdgpu/nv.c +++ b/drivers/gpu/drm/amd/amdgpu/nv.c @@ -1027,12 +1027,6 @@ static int nv_common_hw_init(void *handle) nv_program_aspm(adev); /* setup nbio registers */ adev->nbio.funcs->init_registers(adev); - /* remap HDP registers to a hole in mmio space, - * for the purpose of expose those registers - * to process space - */ - if (adev->nbio.funcs->remap_hdp_registers) - adev->nbio.funcs->remap_hdp_registers(adev); /* enable the doorbell aperture */ nv_enable_doorbell_aperture(adev, true); diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c index 603b01db69e2..b1d8e6767d41 100644 --- a/drivers/gpu/drm/amd/amdgpu/soc15.c +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c @@ -1235,13 +1235,6 @@ static int soc15_common_hw_init(void *handle) soc15_program_aspm(adev); /* setup nbio registers */ adev->nbio.funcs->init_registers(adev); - /* remap HDP registers to a hole in mmio space, - * for the purpose of expose those registers - * to process space - */ - if (adev->nbio.funcs->remap_hdp_registers) - adev->nbio.funcs->remap_hdp_registers(adev); - /* enable the doorbell aperture */ soc15_enable_doorbell_aperture(adev, true); /* HW doorbell routing policy: doorbell writing not -- 2.37.2
From 2b2fa1488ba8932cb7606343c81e2c59b9d86633 Mon Sep 17 00:00:00 2001 From: Alex Deucher <alexander.deucher@xxxxxxx> Date: Tue, 30 Aug 2022 11:08:09 -0400 Subject: [PATCH 3/3] drm/amdgpu: add HDP remap functionality to nbio 7.7 Was missing before and would have resulted in a write to a non-existant register. Normally APUs don't use HDP, but other asics could use this code and APUs do use the HDP when used in passthrough. Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx> --- drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c b/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c index 1dc95ef21da6..21eac06bf1ed 100644 --- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c +++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c @@ -28,6 +28,14 @@ #include "nbio/nbio_7_7_0_sh_mask.h" #include <uapi/linux/kfd_ioctl.h> +static void nbio_v7_7_remap_hdp_registers(struct amdgpu_device *adev) +{ + WREG32_SOC15(NBIO, 0, regBIF_BX0_REMAP_HDP_MEM_FLUSH_CNTL, + adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL); + WREG32_SOC15(NBIO, 0, regBIF_BX0_REMAP_HDP_REG_FLUSH_CNTL, + adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_REG_FLUSH_CNTL); +} + static u32 nbio_v7_7_get_rev_id(struct amdgpu_device *adev) { u32 tmp; @@ -342,4 +350,5 @@ const struct amdgpu_nbio_funcs nbio_v7_7_funcs = { .get_clockgating_state = nbio_v7_7_get_clockgating_state, .ih_control = nbio_v7_7_ih_control, .init_registers = nbio_v7_7_init_registers, + .remap_hdp_registers = nbio_v7_7_remap_hdp_registers, }; -- 2.37.2
From 3396c4a45c16cd1593d90f40a455ee30818cbf7c Mon Sep 17 00:00:00 2001 From: Alex Deucher <alexander.deucher@xxxxxxx> Date: Thu, 1 Sep 2022 15:36:29 -0400 Subject: [PATCH 2/3] drm/amdgpu: init HDP remap as part of GMC hw_init for gmc11 GMC is the primary user of this so do it as part of GMC init rather than common init. This way the remap is always applied before it's used. This fixes the Unsupported Request error reported through AER during driver load. The error happens as a write happens to the remap offset before real remapping is done. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373 The error was unnoticed before and got visible because of the commit referenced below. This doesn't fix anything in the commit below, rather fixes the issue in amdgpu exposed by the commit. The reference is only to associate this commit with below one so that both go together. Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx> --- drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 7 +++++++ drivers/gpu/drm/amd/amdgpu/soc21.c | 6 ------ 2 files changed, 7 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c index 846ccb6cf07d..ae907bbe70d7 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c @@ -894,6 +894,13 @@ static int gmc_v11_0_hw_init(void *handle) /* The sequence of these two function calls matters.*/ gmc_v11_0_init_golden_registers(adev); + /* remap HDP registers to a hole in mmio space, + * for the purpose of expose those registers + * to process space + */ + if (adev->nbio.funcs->remap_hdp_registers) + adev->nbio.funcs->remap_hdp_registers(adev); + r = gmc_v11_0_gart_enable(adev); if (r) return r; diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c b/drivers/gpu/drm/amd/amdgpu/soc21.c index a9d5a784cc48..648cd1d64fd7 100644 --- a/drivers/gpu/drm/amd/amdgpu/soc21.c +++ b/drivers/gpu/drm/amd/amdgpu/soc21.c @@ -674,12 +674,6 @@ static int soc21_common_hw_init(void *handle) soc21_program_aspm(adev); /* setup nbio registers */ adev->nbio.funcs->init_registers(adev); - /* remap HDP registers to a hole in mmio space, - * for the purpose of expose those registers - * to process space - */ - if (adev->nbio.funcs->remap_hdp_registers) - adev->nbio.funcs->remap_hdp_registers(adev); /* enable the doorbell aperture */ soc21_enable_doorbell_aperture(adev, true); -- 2.37.2