Re: [PATCH 10/10] drm/amdgpu: stop removing BOs from the LRU v3

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Leaving BOs on the LRU is harmless. We always did this for VM page table and per VM BOs.

The key point is that BOs which couldn't be reserved can't be evicted. So what happened is that an application used basically all of VRAM during CS and because of this X server couldn't pin a BO for scanout.

Now we keep the BOs on the LRU and modify TTM to block for the CS to complete, which in turn allows the X server to pin its BO for scanout.

Christian.

Am 22.05.19 um 21:43 schrieb Kuehling, Felix:
Can you explain how this avoids OOM situations? When is it safe to leave
a reserved BO on the LRU list? Could we do the same thing in
amdgpu_amdkfd_gpuvm.c? And if we did, what would be the expected side
effects or consequences?

Thanks,
    Felix

On 2019-05-22 8:59 a.m., Christian König wrote:
[CAUTION: External Email]

This avoids OOM situations when we have lots of threads
submitting at the same time.

v3: apply this to the whole driver, not just CS

Signed-off-by: Christian König <christian.koenig@xxxxxxx>
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     | 2 +-
   drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c    | 2 +-
   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c    | 4 ++--
   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 2 +-
   4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 20f2955d2a55..3e2da24cd17a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
          }

          r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
-                                  &duplicates, true);
+                                  &duplicates, false);
          if (unlikely(r != 0)) {
                  if (r != -ERESTARTSYS)
                          DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
index 06f83cac0d3a..f660628e6af9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
@@ -79,7 +79,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm,
          list_add(&csa_tv.head, &list);
          amdgpu_vm_get_pd_bo(vm, &list, &pd);

-       r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL, true);
+       r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL, false);
          if (r) {
                  DRM_ERROR("failed to reserve CSA,PD BOs: err=%d\n", r);
                  return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index d513a5ad03dd..ed25a4e14404 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -171,7 +171,7 @@ void amdgpu_gem_object_close(struct drm_gem_object *obj,

          amdgpu_vm_get_pd_bo(vm, &list, &vm_pd);

-       r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates, true);
+       r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates, false);
          if (r) {
                  dev_err(adev->dev, "leaking bo va because "
                          "we fail to reserve bo (%d)\n", r);
@@ -608,7 +608,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,

          amdgpu_vm_get_pd_bo(&fpriv->vm, &list, &vm_pd);

-       r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates, true);
+       r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates, false);
          if (r)
                  goto error_unref;

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index c430e8259038..d60593cc436e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -155,7 +155,7 @@ static inline int amdgpu_bo_reserve(struct amdgpu_bo *bo, bool no_intr)
          struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
          int r;

-       r = ttm_bo_reserve(&bo->tbo, !no_intr, false, NULL);
+       r = __ttm_bo_reserve(&bo->tbo, !no_intr, false, NULL);
          if (unlikely(r != 0)) {
                  if (r != -ERESTARTSYS)
                          dev_err(adev->dev, "%p reserve failed\n", bo);
--
2.17.1

_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel




[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux