[AMD Official Use Only - General] Comments inline > -----Original Message----- > From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of > Sreekant Somasekharan > Sent: Monday, April 3, 2023 3:59 PM > To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx > Cc: Somasekharan, Sreekant <Sreekant.Somasekharan@xxxxxxx> > Subject: [PATCH] drm/amdkfd: On GFX11 check PCIe atomics support and set > CP_HQD_HQ_STATUS0[29] > > On GFX11, CP_HQD_HQ_STATUS0[29] bit will be used by CPFW to acknowledge > whether PCIe atomics are supported. The default value of this bit is set > to 0. Driver will check whether PCIe atomics are supported and set the > bit to 1 if supported. This will force CPFW to use real atomic ops. > If the bit is not set, CPFW will default to read/modify/write using the > firmware itself. > > This is applicable only to RS64 based GFX11 with MEC FW greater than or > equal to 509. If MEC FW is less than 509, PCIe atomics needs to be > supported, else it will skip the device. > > This commit also involves moving amdgpu_amdkfd_device_probe() function > call after per-IP early_init loop in amdgpu_device_ip_early_init() > function so as to check for RS64 enabled device. > > Signed-off-by: Sreekant Somasekharan <sreekant.somasekharan@xxxxxxx> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- > drivers/gpu/drm/amd/amdkfd/kfd_device.c | 11 +++++++++++ > drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c | 9 +++++++++ > 3 files changed, 21 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index 7116119ed038..b3a754ca0923 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > @@ -2150,7 +2150,6 @@ static int amdgpu_device_ip_early_init(struct > amdgpu_device *adev) > adev->has_pr3 = parent ? pci_pr3_present(parent) : false; > } > > - amdgpu_amdkfd_device_probe(adev); > > adev->pm.pp_feature = amdgpu_pp_feature_mask; > if (amdgpu_sriov_vf(adev) || sched_policy == > KFD_SCHED_POLICY_NO_HWS) > @@ -2206,6 +2205,7 @@ static int amdgpu_device_ip_early_init(struct > amdgpu_device *adev) > if (!total) > return -ENODEV; > > + amdgpu_amdkfd_device_probe(adev); > adev->cg_flags &= amdgpu_cg_mask; > adev->pg_flags &= amdgpu_pg_mask; > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c > b/drivers/gpu/drm/amd/amdkfd/kfd_device.c > index 521dfa88aad8..64a295a35d37 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c > @@ -204,6 +204,17 @@ static void kfd_device_info_init(struct kfd_dev *kfd, > /* Navi1x+ */ > if (gc_version >= IP_VERSION(10, 1, 1)) > kfd->device_info.needs_pci_atomics = true; > + } else if (gc_version < IP_VERSION(12, 0, 0)) { What if we get a GFX9 with MEC v509? Wouldn't that trigger this too? Wondering if this should be if (gc_version>=IP_VERSION(11,0,0) && gc_version < IP_VERSION(12,0,0)) thus ensuring it's only GFX11. Or maybe there is some better check than that. Either way, checking that it's < GFX11 might false-positive on GFX10- too, so we should probably be explicit in our GFX check that it's only GFX11. Kent > + /* On GFX11 running on RS64, MEC FW version must be > greater than > + * or equal to version 509 to support acknowledging > whether > + * PCIe atomics are supported. Before MEC version 509, > PCIe > + * atomics are required. After that, the FW's use of > atomics > + * is controlled by CP_HQD_HQ_STATUS0[29]. > + * This will fail on GFX11 when PCIe atomics are not > supported > + * and MEC FW version < 509 for RS64 based CPFW. > + */ > + kfd->device_info.needs_pci_atomics = true; > + kfd->device_info.no_atomic_fw_version = kfd->adev- > >gfx.rs64_enable ? 509 : 0; > } > } else { > kfd->device_info.doorbell_size = 4; > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c > b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c > index 4a9af800b1f1..c5ea594abbf6 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c > @@ -143,6 +143,15 @@ static void init_mqd(struct mqd_manager *mm, void > **mqd, > 1 << CP_HQD_QUANTUM__QUANTUM_SCALE__SHIFT > | > 1 << > CP_HQD_QUANTUM__QUANTUM_DURATION__SHIFT; > > + /* > + * If PCIe atomics are supported, set CP_HQD_HQ_STATUS0[29] == 1 > + * to force CPFW to use atomics. This is supported only on MEC FW > + * version >= 509 and on RS64 based CPFW only. On previous versions, > + * platforms running on GFX11 must support atomics else will skip the > device. > + */ > + if (amdgpu_amdkfd_have_atomics_support((mm->dev->adev))) > + m->cp_hqd_hq_status0 |= 1 << 29; > + > if (q->format == KFD_QUEUE_FORMAT_AQL) { > m->cp_hqd_aql_control = > 1 << CP_HQD_AQL_CONTROL__CONTROL0__SHIFT; > -- > 2.25.1