Am 23.10.24 um 09:38 schrieb Tvrtko Ursulin:
On 22/10/2024 17:24, Christian König wrote:
Am 22.10.24 um 17:17 schrieb Li, Yunxiang (Teddy):
[Public]
+static uint32_t fold_memtype(uint32_t memtype) {
In general please add prefixes to even static functions, e.g.
amdgpu_vm_ or
amdgpu_bo_.
+ /* Squash private placements into 'cpu' to keep the legacy
userspace view.
*/
+ switch (mem_type) {
+ case TTM_PL_VRAM:
+ case TTM_PL_TT:
+ return memtype
+ default:
+ return TTM_PL_SYSTEM;
+ }
+}
+
+static uint32_t bo_get_memtype(struct amdgpu_bo *bo) {
That whole function belongs into amdgpu_bo.c
Do you mean bo_get_memtype or fold_memtype? I debated whether
bo_get_memtype should go into amdgpu_vm.c or amdgpu_bo.c, and since
it's using fold_memtype and only useful for memory stats because of
folding the private placements I just left them here together with
the other mem stats code.
I can move it to amdgpu_bo.c make it return the memtype verbatim and
just fold it when I do the accounting.
I think that folding GDS, GWS and OA into system is also a bug. We
should really not doing that.
Just wanted to point out for this round that the code to query the
current placement from a BO should probably go into amdgpu_bo.c and
not amdgpu_vm.c
+ struct ttm_resource *res = bo->tbo.resource;
+ const uint32_t domain_to_pl[] = {
+ [ilog2(AMDGPU_GEM_DOMAIN_CPU)] = TTM_PL_SYSTEM,
+ [ilog2(AMDGPU_GEM_DOMAIN_GTT)] = TTM_PL_TT,
+ [ilog2(AMDGPU_GEM_DOMAIN_VRAM)] = TTM_PL_VRAM,
+ [ilog2(AMDGPU_GEM_DOMAIN_GDS)] = AMDGPU_PL_GDS,
+ [ilog2(AMDGPU_GEM_DOMAIN_GWS)] = AMDGPU_PL_GWS,
+ [ilog2(AMDGPU_GEM_DOMAIN_OA)] = AMDGPU_PL_OA,
+ [ilog2(AMDGPU_GEM_DOMAIN_DOORBELL)] =
AMDGPU_PL_DOORBELL,
+ };
+ uint32_t domain;
+
+ if (res)
+ return fold_memtype(res->mem_type);
+
+ /*
+ * If no backing store use one of the preferred domain for basic
+ * stats. We take the MSB since that should give a reasonable
+ * view.
+ */
+ BUILD_BUG_ON(TTM_PL_VRAM < TTM_PL_TT || TTM_PL_VRAM <
TTM_PL_SYSTEM);
+ domain = fls(bo->preferred_domains & AMDGPU_GEM_DOMAIN_MASK);
+ if (drm_WARN_ON_ONCE(&adev->ddev,
+ domain == 0 || --domain >=
ARRAY_SIZE(domain_to_pl)))
It's perfectly legal to create a BO without a placement. That one
just won't have a
backing store.
This is lifted from the previous change I'm rebasing onto. I think
what it’s trying to do is if the BO doesn't have a placement, use
the "biggest" (VRAM > TT > SYSTEM) preferred placement for the
purpose of accounting. Previously we just ignore BOs that doesn't
have a placement. I guess there's argument for going with either
approaches.
I was not arguing, I'm simply pointing out a bug. It's perfectly
valid for bo->preferred_domains to be 0.
So the following WARN_ON() that no bit is set is incorrect.
+ return 0;
+ return fold_memtype(domain_to_pl[domain])
That would need specular execution mitigation if I'm not completely
mistaken.
Better use a switch/case statement.
Do you mean change the array indexing to a switch statement?
Yes.
Did you mean array_index_nospec?
Yes.
Domain is not a direct userspace input and is calculated from the mask
which sanitized to allowed values prior to this call. So I *think*
switch is an overkill but don't mind it either. Just commenting FWIW.
I missed that the mask is applied.
Thinking more about it I'm not sure if we should do this conversion in
the first place. IIRC Tvrtko you once suggested a patch which switched a
bunch of code to use the TTM placement instead of the UAPI flags.
Going more into this direction I think when we want to look at the
current placement we should probably also use the TTM PL enumeration
directly.
Regards,
Christian.
Regards,
Tvrtko