On 2022-11-25 05:21, Christian König wrote:
We already fallback to a dummy BO with no backing store when we
allocate GDS,GWS and OA resources and to GTT when we allocate VRAM.
Drop all those workarounds and generalize this for GTT as well. This
fixes ENOMEM issues with runaway applications which try to allocate/free
GTT in a loop and are otherwise only limited by the CPU speed.
The CS will wait for the cleanup of freed up BOs to satisfy the
various domain specific limits and so effectively throttle those
buggy applications down to a sane allocation behavior again.
Signed-off-by: Christian König <christian.koenig@xxxxxxx>
This patch causes some regressions in KFDTest. KFDMemoryTest.MMBench
sees a huge VRAM allocation slow-down. And
KFDMemoryTest.LargestVramBufferTest can only allocate half the available
memory.
This seems to be caused by initially validating VRAM BOs in the CPU
domain, which allocates a ttm_tt. A subsequent validation in the VRAM
domain involves a copy from GTT to VRAM.
After that, freeing of BOs can get delayed by the ghost object of a
previous migration, which delays calling release notifiers and causes
problems for KFDs available memory accounting.
I experimented with a workaround that validates BOs immediately after
allocation, but that only moves around the delays and doesn't solve the
problem. During those experiments I may also have stumbled over a bug in
ttm_buffer_object_transfer: It calls ttm_bo_set_bulk_move before
initializing and locking fbo->base.base._resv. This results in a flood
of warnings because ttm_bo_set_bulk_move expects the reservation to be
locked.
Right now I'd like to remove the bp.domain = initial_domain |
AMDGPU_GEM_DOMAIN_CPU change in amdgpu_gem_object_create to fix this.
Regards,
Felix
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 16 +++-------------
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 6 +-----
2 files changed, 4 insertions(+), 18 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index a0780a4e3e61..62e98f1ad770 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -113,7 +113,7 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size,
bp.resv = resv;
bp.preferred_domain = initial_domain;
bp.flags = flags;
- bp.domain = initial_domain;
+ bp.domain = initial_domain | AMDGPU_GEM_DOMAIN_CPU;
bp.bo_ptr_size = sizeof(struct amdgpu_bo);
r = amdgpu_bo_create_user(adev, &bp, &ubo);
@@ -332,20 +332,10 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data,
}
initial_domain = (u32)(0xffffffff & args->in.domains);
-retry:
r = amdgpu_gem_object_create(adev, size, args->in.alignment,
- initial_domain,
- flags, ttm_bo_type_device, resv, &gobj);
+ initial_domain, flags, ttm_bo_type_device,
+ resv, &gobj);
if (r && r != -ERESTARTSYS) {
- if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) {
- flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
- goto retry;
- }
-
- if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) {
- initial_domain |= AMDGPU_GEM_DOMAIN_GTT;
- goto retry;
- }
DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %d)\n",
size, initial_domain, args->in.alignment, r);
}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 974e85d8b6cc..919bbea2e3ac 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -581,11 +581,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
bo->flags |= AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE;
bo->tbo.bdev = &adev->mman.bdev;
- if (bp->domain & (AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA |
- AMDGPU_GEM_DOMAIN_GDS))
- amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_CPU);
- else
- amdgpu_bo_placement_from_domain(bo, bp->domain);
+ amdgpu_bo_placement_from_domain(bo, bp->domain);
if (bp->type == ttm_bo_type_kernel)
bo->tbo.priority = 1;