Re: [PATCH] drm/amdkfd: Page aligned VRAM reserve size

On 2023-01-09 22:14, Felix Kuehling wrote:
On 2023-01-09 at 19:01, Philip Yang wrote:
Use the page-aligned size to reserve VRAM usage, because the page-aligned
TTM BO size is used to unreserve it; otherwise the vram_used accounting
becomes unbalanced.

Change the type of vram_used to int64_t so that
WARN_ONCE(adev && adev->kfd.vram_used < 0, "...") can trigger, to help debug
the accounting issue with a warning and backtrace.

Signed-off-by: Philip Yang <Philip.Yang@xxxxxxx>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h       | 2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index fb41869e357a..333780491867 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -97,7 +97,7 @@ struct amdgpu_amdkfd_fence {
    struct amdgpu_kfd_dev {
      struct kfd_dev *dev;
-    uint64_t vram_used;
+    int64_t vram_used;
      uint64_t vram_used_aligned;
      bool init_complete;
      struct work_struct reset_work;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 2a118669d0e3..f23d94e57762 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -151,7 +151,7 @@ int amdgpu_amdkfd_reserve_mem_limit(struct amdgpu_device *adev,
           * to avoid fragmentation caused by 4K allocations in the tail
           * 2M BO chunk.
           */
-        vram_needed = size;
+        vram_needed = PAGE_ALIGN(size);

This only solves part of the problem. size is used in other places in this function that should all use the page-aligned size. I think we should do the page-alignment at a much higher level, in kfd_ioctl_alloc_memory_of_gpu. That way all the kernel code can safely assume that buffer sizes are page aligned, and we avoid future surprises.

Yes, the error-handling unreserve should use the aligned size too. However, size is also used as the number of pages in amdgpu_bo_create for DOMAIN_GWS etc., so we cannot pass the aligned size at a higher level. I will send a v2 patch for review.
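
For readers following the thread, the imbalance can be illustrated with a minimal userspace sketch. This is not the kernel code: SKETCH_PAGE_SIZE (assumed to be 4 KiB), SKETCH_PAGE_ALIGN, vram_used_after_cycle and sketch_demo are hypothetical names that only mimic PAGE_ALIGN() and the vram_used counter. It shows why reserving the raw size while unreserving the page-aligned size drives a signed counter negative, which is the condition the WARN_ONCE check is meant to catch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; the kernel takes the real value from the architecture. */
#define SKETCH_PAGE_SIZE 4096ULL
/* Userspace stand-in for the kernel's PAGE_ALIGN(): round up to a page boundary. */
#define SKETCH_PAGE_ALIGN(x) \
	(((x) + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1))

/*
 * Return the value of a hypothetical vram_used counter after one
 * reserve/unreserve cycle for a buffer of `size` bytes.
 * The unreserve side always uses the page-aligned size, as described
 * in the patch for the TTM BO size.
 */
int64_t vram_used_after_cycle(uint64_t size, int align_on_reserve)
{
	int64_t vram_used = 0;
	uint64_t reserved = align_on_reserve ? SKETCH_PAGE_ALIGN(size) : size;

	vram_used += (int64_t)reserved;                /* reserve */
	vram_used -= (int64_t)SKETCH_PAGE_ALIGN(size); /* unreserve */
	return vram_used;
}

/* Example: a 4097-byte request occupies two 4 KiB pages. */
void sketch_demo(void)
{
	/* Unaligned reserve: the counter ends up below zero. */
	assert(vram_used_after_cycle(4097, 0) < 0);
	/* Aligned reserve: the counter returns to zero. */
	assert(vram_used_after_cycle(4097, 1) == 0);
}
```

With an unsigned counter the first case would wrap around to a very large value instead of going negative, so a "< 0" check could never fire; that is the motivation given in the patch for switching vram_used to int64_t.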

Regards,

Philip


Regards,
  Felix


      } else if (alloc_flag & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) {
          system_mem_needed = size;
      } else if (!(alloc_flag &


