Re: [PATCH 1/2] drm/amdgpu: remove acc_size from reserve/unreserve mem

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 2022-05-18 17:31, Felix Kuehling wrote:
On 2022-05-18 13:55, philip yang wrote:


On 2022-05-17 19:11, Alex Sierra wrote:
TTM used to track the "acc_size" of all BOs internally. We needed to
keep track of it in our memory reservation to avoid TTM running out
of memory in its own accounting. However, that "acc_size" accounting
has since been removed from TTM. Therefore we don't really need to
track it any more.

acc_size is size of amdgpu_bo data structure plus size of pages array and dma_address array, it is needed for each BO, so should track as system_mem_needed. It can be removed from ttm_mem_needed as this is not allocated by TTM as GTT memory.

You have a point, I didn't think of that. The fact that TTM isn't tracking the data structure sizes any more doesn't mean, we shouldn't account for it in our own system memory usage.

That said, do we actually have DMA address arrays for VRAM allocations?

Also, acc_size doesn't track the extra dmabuf BOs we create for DMA mappings on multiple GPUs. So I'm not sure how useful the acc_size tracking is at this point. The system memory limit is currently 15/16 of total memory. Maybe that leaves enough reserve for data structure sizes?
Based on the calculation, set 15/16 free system memory limit is enough to prevent OOM killer, as acc_size is up to 1/64 of memory size on 8GPU system, so it is safe to simplify and remove data structure acc_size.

One nit-pick below. This patch is Reviewed-by: Philip Yang <philip.yang@xxxxxxx>


Regards,
  Felix


Regards,

Philip

Signed-off-by: Alex Sierra<alex.sierra@xxxxxxx>
---
  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 57 ++++++-------------
  1 file changed, 16 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index fada3b149361..e985cf9c7ec0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -108,17 +108,8 @@ void amdgpu_amdkfd_reserve_system_mem(uint64_t size)
   * compromise that should work in most cases without reserving too
   * much memory for page tables unnecessarily (factor 16K, >> 14).
   */
-#define ESTIMATE_PT_SIZE(mem_size) ((mem_size) >> 14)
-
-static size_t amdgpu_amdkfd_acc_size(uint64_t size)
-{
-    size >>= PAGE_SHIFT;
-    size *= sizeof(dma_addr_t) + sizeof(void *);
  -    return __roundup_pow_of_two(sizeof(struct amdgpu_bo)) +
-        __roundup_pow_of_two(sizeof(struct ttm_tt)) +
-        PAGE_ALIGN(size);
-}
+#define ESTIMATE_PT_SIZE(mem_size) ((mem_size) >> 14)
    /**
   * amdgpu_amdkfd_reserve_mem_limit() - Decrease available memory by size

Remove "including any reserved for control structures" from function description.

Regards,

Philip

@@ -136,28 +127,22 @@ static int amdgpu_amdkfd_reserve_mem_limit(struct amdgpu_device *adev,
  {
      uint64_t reserved_for_pt =
          ESTIMATE_PT_SIZE(amdgpu_amdkfd_total_mem_size);
-    size_t acc_size, system_mem_needed, ttm_mem_needed, vram_needed;
+    size_t system_mem_needed, ttm_mem_needed, vram_needed;
      int ret = 0;
  -    acc_size = amdgpu_amdkfd_acc_size(size);
-
+    system_mem_needed = 0;
+    ttm_mem_needed = 0;
      vram_needed = 0;
      if (alloc_flag & KFD_IOC_ALLOC_MEM_FLAGS_GTT) {
-        system_mem_needed = acc_size + size;
-        ttm_mem_needed = acc_size + size;
+        system_mem_needed = size;
+        ttm_mem_needed = size;
      } else if (alloc_flag & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
-        system_mem_needed = acc_size;
-        ttm_mem_needed = acc_size;
          vram_needed = size;
      } else if (alloc_flag & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) {
-        system_mem_needed = acc_size + size;
-        ttm_mem_needed = acc_size;
-    } else if (alloc_flag &
-           (KFD_IOC_ALLOC_MEM_FLAGS_DOORBELL |
-            KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP)) {
-        system_mem_needed = acc_size;
-        ttm_mem_needed = acc_size;
-    } else {
+        system_mem_needed = size;
+    } else if (!(alloc_flag &
+                (KFD_IOC_ALLOC_MEM_FLAGS_DOORBELL |
+                 KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP))) {
          pr_err("%s: Invalid BO type %#x\n", __func__, alloc_flag);
          return -ENOMEM;
      }
@@ -193,28 +178,18 @@ static int amdgpu_amdkfd_reserve_mem_limit(struct amdgpu_device *adev,
  static void unreserve_mem_limit(struct amdgpu_device *adev,
          uint64_t size, u32 alloc_flag)
  {
-    size_t acc_size;
-
-    acc_size = amdgpu_amdkfd_acc_size(size);
-
      spin_lock(&kfd_mem_limit.mem_limit_lock);
        if (alloc_flag & KFD_IOC_ALLOC_MEM_FLAGS_GTT) {
-        kfd_mem_limit.system_mem_used -= (acc_size + size);
-        kfd_mem_limit.ttm_mem_used -= (acc_size + size);
+        kfd_mem_limit.system_mem_used -= size;
+        kfd_mem_limit.ttm_mem_used -= size;
      } else if (alloc_flag & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
-        kfd_mem_limit.system_mem_used -= acc_size;
-        kfd_mem_limit.ttm_mem_used -= acc_size;
          adev->kfd.vram_used -= size;
      } else if (alloc_flag & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) {
-        kfd_mem_limit.system_mem_used -= (acc_size + size);
-        kfd_mem_limit.ttm_mem_used -= acc_size;
-    } else if (alloc_flag &
-           (KFD_IOC_ALLOC_MEM_FLAGS_DOORBELL |
-            KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP)) {
-        kfd_mem_limit.system_mem_used -= acc_size;
-        kfd_mem_limit.ttm_mem_used -= acc_size;
-    } else {
+        kfd_mem_limit.system_mem_used -= size;
+    } else if (!(alloc_flag &
+                (KFD_IOC_ALLOC_MEM_FLAGS_DOORBELL |
+                 KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP))) {
          pr_err("%s: Invalid BO type %#x\n", __func__, alloc_flag);
          goto release;
      }

[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux