On 12/13/2022 12:30 PM, Christian König wrote:
On 13.12.22 at 00:44, Luben Tuikov wrote:
On 2022-12-12 14:19, Christian König wrote:
On 12.12.22 at 18:48, Luben Tuikov wrote:
Fix amdgpu_bo_validate_size() to check whether the TTM domain manager for the
requested memory exists, and to allow for non-exclusive domain allocations, as
there would be if the domain is a mask, e.g. AMDGPU_GEM_DOMAIN_VRAM |
AMDGPU_GEM_DOMAIN_GTT.
Cc: Alex Deucher <Alexander.Deucher@xxxxxxx>
Cc: Christian König <christian.koenig@xxxxxxx>
Signed-off-by: Luben Tuikov <luben.tuikov@xxxxxxx>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 19 +++++++------------
1 file changed, 7 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index fd3ab4b5e5bb1f..e0f103f0ec2178 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -448,31 +448,26 @@ static bool amdgpu_bo_validate_size(struct amdgpu_device *adev,
/*
* If GTT is part of requested domains the check must succeed to
- * allow fall back to GTT
+ * allow fall back to GTT.
+ *
+ * Note that allocations can request from either domain. For
+ * this reason, check either in non-exclusive way, and if
+ * neither satisfies, fail the validation.
That's not correct, the original logic was completely intentional.
If both VRAM and GTT are specified it's valid if the size fits only into
GTT.
Given that this patch fixes a kernel oops, should this patch then fail the
validation, i.e. return false?
It should be sufficient if a BO fits into the GTT domain for size
validation. If we haven't initialized the GTT domain and end up here we
should probably just ignore it.
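
The rule described above can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the driver code: `sketch_device`, `sketch_manager`, `sketch_validate_size()` and the `SKETCH_DOMAIN_*` masks are hypothetical stand-ins, and "ignore it" is read here as "skip the GTT check and fall through to the next requested domain", which is one possible interpretation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the AMDGPU_GEM_DOMAIN_* masks. */
#define SKETCH_DOMAIN_GTT  0x2u
#define SKETCH_DOMAIN_VRAM 0x4u

/* Hypothetical stand-in for a TTM resource manager: only its size matters. */
struct sketch_manager {
    uint64_t size;
};

/* Hypothetical device: a NULL pointer models a manager that was never set up. */
struct sketch_device {
    const struct sketch_manager *gtt;
    const struct sketch_manager *vram;
};

/*
 * If GTT is requested and its manager exists, fitting into GTT decides the
 * result, even when VRAM is requested as well. A missing GTT manager is
 * skipped rather than dereferenced, so the VRAM check gets a chance.
 */
static bool sketch_validate_size(const struct sketch_device *dev,
                                 uint64_t size, uint32_t domain)
{
    assert(dev != NULL);

    if ((domain & SKETCH_DOMAIN_GTT) && dev->gtt)
        return size < dev->gtt->size;

    if ((domain & SKETCH_DOMAIN_VRAM) && dev->vram)
        return size < dev->vram->size;

    /* No checkable domain was requested: accept, as for other domains. */
    return true;
}
```

With a 1024-byte GTT and a 256-byte VRAM, a 512-byte request for VRAM | GTT passes in this model because it fits into GTT alone, while the same request against a device without a GTT manager is judged against VRAM and fails.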
This would then fail, in amdgpu_ttm_reserve_tmr():
ret = amdgpu_bo_create_kernel_at(adev,
adev->gmc.real_vram_size - adev->mman.discovery_tmr_size,
adev->mman.discovery_tmr_size,
AMDGPU_GEM_DOMAIN_VRAM |
AMDGPU_GEM_DOMAIN_GTT,
As I said before, using amdgpu_bo_create_kernel_at() with VRAM|GTT
doesn't make any sense at all. We should probably drop the domain
parameter altogether.
What is the alternative planned to prevent usage of VRAM at fixed offsets?
BTW, AMDGPU_GEM_DOMAIN_GTT doesn't make any sense for the above. The
discovery region is always in the VRAM domain.
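
To make the point about fixed offsets concrete, here is a minimal user-space sketch, assuming a fixed-offset reservation is only meaningful when the domain is exactly VRAM. The names `sketch_tmr_offset()`, `sketch_fixed_offset_ok()` and the `SKETCH_DOMAIN_*` masks are hypothetical and are not part of amdgpu; the offset arithmetic mirrors the `real_vram_size - discovery_tmr_size` expression quoted in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the AMDGPU_GEM_DOMAIN_* masks. */
#define SKETCH_DOMAIN_GTT  0x2u
#define SKETCH_DOMAIN_VRAM 0x4u

/* Offset of a region of tmr_size bytes placed at the very end of VRAM. */
static uint64_t sketch_tmr_offset(uint64_t vram_size, uint64_t tmr_size)
{
    assert(tmr_size <= vram_size);
    return vram_size - tmr_size;
}

/*
 * A fixed offset only has a meaning inside VRAM, so any domain mask other
 * than exactly VRAM is rejected, and the range must lie within VRAM.
 */
static bool sketch_fixed_offset_ok(uint64_t vram_size, uint64_t offset,
                                   uint64_t size, uint32_t domain)
{
    if (domain != SKETCH_DOMAIN_VRAM)
        return false;
    if (size == 0 || offset > vram_size)
        return false;
    /* Written as a subtraction so offset + size cannot overflow. */
    return size <= vram_size - offset;
}
```

Under these assumptions a VRAM | GTT request at a fixed offset is refused up front, which is the behaviour one would get implicitly if the domain parameter were dropped and VRAM assumed.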
Thanks,
Lijo
Regards,
Christian.
&adev->mman.discovery_memory,
NULL);
Regards,
Luben