[Public] > -----Original Message----- > From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of > Russell, Kent > Sent: Tuesday, April 4, 2023 9:43 AM > To: Somasekharan, Sreekant <Sreekant.Somasekharan@xxxxxxx>; amd- > gfx@xxxxxxxxxxxxxxxxxxxxx > Cc: Somasekharan, Sreekant <Sreekant.Somasekharan@xxxxxxx> > Subject: RE: [PATCH] drm/amdkfd: On GFX11 check PCIe atomics support and > set CP_HQD_HQ_STATUS0[29] > > Caution: This message originated from an External Source. Use proper > caution when opening attachments, clicking links, or responding. > > > [AMD Official Use Only - General] > > Comments inline > > > -----Original Message----- > > From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of > > Sreekant Somasekharan > > Sent: Monday, April 3, 2023 3:59 PM > > To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx > > Cc: Somasekharan, Sreekant <Sreekant.Somasekharan@xxxxxxx> > > Subject: [PATCH] drm/amdkfd: On GFX11 check PCIe atomics support and > > set CP_HQD_HQ_STATUS0[29] > > > > On GFX11, CP_HQD_HQ_STATUS0[29] bit will be used by CPFW to > > acknowledge whether PCIe atomics are supported. The default value of > > this bit is set to 0. Driver will check whether PCIe atomics are > > supported and set the bit to 1 if supported. This will force CPFW to use real > atomic ops. > > If the bit is not set, CPFW will default to read/modify/write using > > the firmware itself. > > > > This is applicable only to RS64 based GFX11 with MEC FW greater than > > or equal to 509. If MEC FW is less than 509, PCIe atomics needs to be > > supported, else it will skip the device. > > > > This commit also involves moving amdgpu_amdkfd_device_probe() > function > > call after per-IP early_init loop in amdgpu_device_ip_early_init() > > function so as to check for RS64 enabled device. > > > > Signed-off-by: Sreekant Somasekharan > <sreekant.somasekharan@xxxxxxx> > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- > > drivers/gpu/drm/amd/amdkfd/kfd_device.c | 11 +++++++++++ > > drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c | 9 +++++++++ > > 3 files changed, 21 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > index 7116119ed038..b3a754ca0923 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > @@ -2150,7 +2150,6 @@ static int amdgpu_device_ip_early_init(struct > > amdgpu_device *adev) > > adev->has_pr3 = parent ? pci_pr3_present(parent) : false; > > } > > > > - amdgpu_amdkfd_device_probe(adev); > > > > adev->pm.pp_feature = amdgpu_pp_feature_mask; > > if (amdgpu_sriov_vf(adev) || sched_policy == > > KFD_SCHED_POLICY_NO_HWS) > > @@ -2206,6 +2205,7 @@ static int amdgpu_device_ip_early_init(struct > > amdgpu_device *adev) > > if (!total) > > return -ENODEV; > > > > + amdgpu_amdkfd_device_probe(adev); > > adev->cg_flags &= amdgpu_cg_mask; > > adev->pg_flags &= amdgpu_pg_mask; > > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c > > b/drivers/gpu/drm/amd/amdkfd/kfd_device.c > > index 521dfa88aad8..64a295a35d37 100644 > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c > > @@ -204,6 +204,17 @@ static void kfd_device_info_init(struct kfd_dev > *kfd, > > /* Navi1x+ */ > > if (gc_version >= IP_VERSION(10, 1, 1)) > > kfd->device_info.needs_pci_atomics = > > true; > > + } else if (gc_version < IP_VERSION(12, 0, 0)) { > > > What if we get a GFX9 with MEC v509? Wouldn't that trigger this too? > Wondering if this should be if (gc_version>=IP_VERSION(11,0,0) && > gc_version < IP_VERSION(12,0,0)) thus ensuring it's only GFX11. Or maybe > there is some better check than that. Either way, checking that it's < GFX11 > might false-positive on GFX10- too, so we should probably be explicit in our > GFX check that it's only GFX11. The previous condition is for gc_version < IP_VERSION(11, 0, 0), so that condition will (and currently is) taken for gfx9/gfx10/etc. That's to say the logic after this change will look like: If (KFD_IS_SOC15(kfd)) { <...> If (gc_version < IP_VERSION(11, 0, 0)) { <...> } else if (gc_version < IP_VERSION(12, 0, 0)) { <...> } } So this new path will only be taken for gfx11. Best, Graham > > Kent > > > + /* On GFX11 running on RS64, MEC FW version must > > + be > > greater than > > + * or equal to version 509 to support > > + acknowledging > > whether > > + * PCIe atomics are supported. Before MEC > > + version 509, > > PCIe > > + * atomics are required. After that, the FW's > > + use of > > atomics > > + * is controlled by CP_HQD_HQ_STATUS0[29]. > > + * This will fail on GFX11 when PCIe atomics are > > + not > > supported > > + * and MEC FW version < 509 for RS64 based CPFW. > > + */ > > + kfd->device_info.needs_pci_atomics = true; > > + kfd->device_info.no_atomic_fw_version = > > + kfd->adev- > > >gfx.rs64_enable ? 509 : 0; > > } > > } else { > > kfd->device_info.doorbell_size = 4; diff --git > > a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c > > b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c > > index 4a9af800b1f1..c5ea594abbf6 100644 > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c > > @@ -143,6 +143,15 @@ static void init_mqd(struct mqd_manager *mm, > void > > **mqd, > > 1 << CP_HQD_QUANTUM__QUANTUM_SCALE__SHIFT > > | > > 1 << > > CP_HQD_QUANTUM__QUANTUM_DURATION__SHIFT; > > > > + /* > > + * If PCIe atomics are supported, set CP_HQD_HQ_STATUS0[29] == 1 > > + * to force CPFW to use atomics. This is supported only on MEC FW > > + * version >= 509 and on RS64 based CPFW only. On previous versions, > > + * platforms running on GFX11 must support atomics else will > > + skip the > > device. > > + */ > > + if (amdgpu_amdkfd_have_atomics_support((mm->dev->adev))) > > + m->cp_hqd_hq_status0 |= 1 << 29; > > + > > if (q->format == KFD_QUEUE_FORMAT_AQL) { > > m->cp_hqd_aql_control = > > 1 << CP_HQD_AQL_CONTROL__CONTROL0__SHIFT; > > -- > > 2.25.1