On 2025-02-06 10:14, Lazar, Lijo wrote:
On 1/29/2025 8:50 PM, Eric Huang wrote:
On some ASICs the L2 cache info may be missing from the KFD topology
because the first bitmap may be empty, which means the first CU is
inactive. Finding the first active CU solves the issue.
Signed-off-by: Eric Huang <jinhuieric.huang@xxxxxxx>
---
drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index 4936697e6fc2..73d95041a388 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -1665,17 +1665,31 @@ static int fill_in_l2_l3_pcache(struct kfd_cache_properties **props_ext,
int cache_type, unsigned int cu_processor_id,
struct kfd_node *knode)
{
- unsigned int cu_sibling_map_mask;
+ unsigned int cu_sibling_map_mask = 0;
int first_active_cu;
int i, j, k, xcc, start, end;
int num_xcc = NUM_XCC(knode->xcc_mask);
struct kfd_cache_properties *pcache = NULL;
enum amdgpu_memory_partition mode;
struct amdgpu_device *adev = knode->adev;
+ bool found = false;
start = ffs(knode->xcc_mask) - 1;
end = start + num_xcc;
- cu_sibling_map_mask = cu_info->bitmap[start][0][0];
+
+ /* To find the bitmap in the first active cu */
+ for (xcc = start; xcc < end && !found; xcc++) {
It seems there is an assumption here that a CU in one XCC could
share this cache with a CU in another XCC. This is not true for GFX 9.4.3
SOCs. In those, a CU in XCC0 doesn't share L2 with a CU in XCC1.
In the KFD topology we only report the L2 cache info of the first active
CU in an XCC, which could be XCC0 or XCC1. The L2 info is generic for the
given XCP/KFD node rather than specific to each XCC. So it doesn't mean
the L2 cache found in XCC0 can be shared with XCC1; it only means there
is an L2 cache in this KFD node.
Regards,
Eric
Thanks,
Lijo
+ for (i = 0; i < gfx_info->max_shader_engines && !found; i++) {
+ for (j = 0; j < gfx_info->max_sh_per_se && !found; j++) {
+ if (cu_info->bitmap[xcc][i % 4][j % 4]) {
+ cu_sibling_map_mask =
+ cu_info->bitmap[xcc][i % 4][j % 4];
+ found = true;
+ }
+ }
+ }
+ }
+
cu_sibling_map_mask &=
((1 << pcache_info[cache_type].num_cu_shared) - 1);
first_active_cu = ffs(cu_sibling_map_mask);
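
For anyone following the thread outside the kernel tree, the scan in the
hunk above can be tried out with a standalone userspace sketch along these
lines. It is not the driver code: struct cu_bitmaps, the MAX_* bounds and
all three helper names are hypothetical stand-ins, and the guard against
shifting by 32 is an addition of this sketch rather than something the
patch does.

```c
#include <assert.h>
#include <stdint.h>
#include <strings.h>	/* ffs() */

/* Hypothetical bounds, chosen only for this sketch. */
#define MAX_XCC 8
#define MAX_SE  4
#define MAX_SH  4

/* Hypothetical stand-in for the per-XCC/SE/SH CU bitmaps. */
struct cu_bitmaps {
	uint32_t bitmap[MAX_XCC][MAX_SE][MAX_SH];
};

/*
 * Return the first non-empty CU bitmap found in XCCs [start, end),
 * or 0 if every CU in that range is inactive.
 */
static uint32_t first_active_bitmap(const struct cu_bitmaps *cu,
				    int start, int end)
{
	for (int xcc = start; xcc < end; xcc++)
		for (int se = 0; se < MAX_SE; se++)
			for (int sh = 0; sh < MAX_SH; sh++)
				if (cu->bitmap[xcc][se][sh])
					return cu->bitmap[xcc][se][sh];
	return 0;
}

/* Keep only the low num_cu_shared bits of the bitmap. */
static uint32_t sibling_mask(uint32_t bitmap, unsigned int num_cu_shared)
{
	assert(num_cu_shared <= 32);
	if (num_cu_shared == 32)	/* a 32-bit shift by 32 is undefined */
		return bitmap;
	return bitmap & ((UINT32_C(1) << num_cu_shared) - 1);
}

/* 1-based index of the first active CU, or 0 if none (ffs() convention). */
static int first_active_cu(const struct cu_bitmaps *cu, int start, int end,
			   unsigned int num_cu_shared)
{
	uint32_t mask = sibling_mask(first_active_bitmap(cu, start, end),
				     num_cu_shared);

	return ffs((int)mask);
}
```

With every bitmap zeroed, first_active_cu() returns 0, which corresponds to
the case where the old code read an empty bitmap[start][0][0] and found no
L2 entry to report.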