Re: [PATCH v5 4/4] drm/amdgpu: track bo memory stats at runtime

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 22/10/2024 17:24, Christian König wrote:
Am 22.10.24 um 17:17 schrieb Li, Yunxiang (Teddy):
[Public]

+static uint32_t fold_memtype(uint32_t memtype) {
In general please add prefixes to even static functions, e.g. amdgpu_vm_ or
amdgpu_bo_.

+   /* Squash private placements into 'cpu' to keep the legacy userspace view.
*/
+   switch (mem_type) {
+   case TTM_PL_VRAM:
+   case TTM_PL_TT:
+           return memtype
+   default:
+           return TTM_PL_SYSTEM;
+   }
+}
+
+static uint32_t bo_get_memtype(struct amdgpu_bo *bo) {
That whole function belongs into amdgpu_bo.c
Do you mean bo_get_memtype or fold_memtype? I debated whether bo_get_memtype should go into amdgpu_vm.c or amdgpu_bo.c, and since it's using fold_memtype and only useful for memory stats because of folding the private placements I just left them here together with the other mem stats code.

I can move it to amdgpu_bo.c make it return the memtype verbatim and just fold it when I do the accounting.

I think that folding GDS, GWS and OA into system is also a bug. We should really not doing that.

Just wanted to point out for this round that the code to query the current placement from a BO should probably go into amdgpu_bo.c and not amdgpu_vm.c


+   struct ttm_resource *res = bo->tbo.resource;
+   const uint32_t domain_to_pl[] = {
+           [ilog2(AMDGPU_GEM_DOMAIN_CPU)]      = TTM_PL_SYSTEM,
+           [ilog2(AMDGPU_GEM_DOMAIN_GTT)]      = TTM_PL_TT,
+           [ilog2(AMDGPU_GEM_DOMAIN_VRAM)]     = TTM_PL_VRAM,
+           [ilog2(AMDGPU_GEM_DOMAIN_GDS)]      = AMDGPU_PL_GDS,
+           [ilog2(AMDGPU_GEM_DOMAIN_GWS)]      = AMDGPU_PL_GWS,
+           [ilog2(AMDGPU_GEM_DOMAIN_OA)]       = AMDGPU_PL_OA,
+           [ilog2(AMDGPU_GEM_DOMAIN_DOORBELL)] =
AMDGPU_PL_DOORBELL,
+   };
+   uint32_t domain;
+
+   if (res)
+           return fold_memtype(res->mem_type);
+
+   /*
+    * If no backing store use one of the preferred domain for basic
+    * stats. We take the MSB since that should give a reasonable
+    * view.
+    */
+   BUILD_BUG_ON(TTM_PL_VRAM < TTM_PL_TT || TTM_PL_VRAM <
TTM_PL_SYSTEM);
+   domain = fls(bo->preferred_domains & AMDGPU_GEM_DOMAIN_MASK);
+   if (drm_WARN_ON_ONCE(&adev->ddev,
+                        domain == 0 || --domain >= ARRAY_SIZE(domain_to_pl)))
It's perfectly legal to create a BO without a placement. That one just won't have a
backing store.

This is lifted from the previous change I'm rebasing onto. I think what it’s trying to do is if the BO doesn't have a placement, use the "biggest" (VRAM > TT > SYSTEM) preferred placement for the purpose of accounting. Previously we just ignore BOs that doesn't have a placement. I guess there's argument for going with either approaches.

I was not arguing, I'm simply pointing out a bug. It's perfectly valid for bo->preferred_domains to be 0.

So the following WARN_ON() that no bit is set is incorrect.


+           return 0;
+   return fold_memtype(domain_to_pl[domain])
That would need specular execution mitigation if I'm not completely mistaken.

Better use a switch/case statement.

Do you mean change the array indexing to a switch statement?

Yes.

Did you mean array_index_nospec? Domain is not a direct userspace input and is calculated from the mask which sanitized to allowed values prior to this call. So I *think* switch is an overkill but don't mind it either. Just commenting FWIW.

Regards,

Tvrtko



[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux