Re: [PATCH] drm/amdgpu: Fix UVD contiguous CS mapping problem

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 14.11.24 um 16:38 schrieb Paneer Selvam, Arunpravin:

Hi Christian,

On 11/11/2024 3:33 PM, Christian König wrote:
Am 11.11.24 um 09:05 schrieb Arunpravin Paneer Selvam:
When starting the mpv player, Radeon R9 users are observing
the below error in dmesg.

[drm:amdgpu_uvd_cs_pass2 [amdgpu]]
*ERROR* msg/fb buffer ff00f7c000-ff00f7e000 out of 256MB segment!

The patch tries to set the TTM_PL_FLAG_CONTIGUOUS for both user
flag(AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS) set and not set cases.

Closes:https://gitlab.freedesktop.org/drm/amd/-/issues/3599
Closes:https://gitlab.freedesktop.org/drm/amd/-/issues/3501
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@xxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 18 +++++++++++-------
  1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index d891ab779ca7..9f73f821054b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1801,13 +1801,17 @@ int amdgpu_cs_find_mapping(struct amdgpu_cs_parser *parser,       if (dma_resv_locking_ctx((*bo)->tbo.base.resv) != &parser->exec.ticket)
          return -EINVAL;
  -    (*bo)->flags |= AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
-    amdgpu_bo_placement_from_domain(*bo, (*bo)->allowed_domains);
-    for (i = 0; i < (*bo)->placement.num_placement; i++)
-        (*bo)->placements[i].flags |= TTM_PL_FLAG_CONTIGUOUS;
-    r = ttm_bo_validate(&(*bo)->tbo, &(*bo)->placement, &ctx);
-    if (r)
-        return r;
+    if ((*bo)->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS) {
+        (*bo)->placements[0].flags |= TTM_PL_FLAG_CONTIGUOUS;

That is a pretty clearly broken approach. (*bo)->placements[0].flags is just used temporary between the call to amdgpu_bo_placement_from_domain() and ttm_bo_validate().

So setting the TTM_PL_FLAG_CONTIGUOUS here is certainly not correct. Why is that necessary?
gitlab users reported that the buffers are out of 256MB segment, looks like buffers are not contiguous, after making the contiguous allocation mandatory using the TTM_PL_FLAG_CONTIGUOUS flag, they are not seeing this issue.

In that case we have a bug somewhere that we don't properly initialize the flags before validating something.

I fear you need to investigate further since that here clearly doesn't fix the bug but just hides it.

Regards,
Christian.


Thanks,
Arun.

Regards,
Christian.

+    } else {
+        (*bo)->flags |= AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
+        amdgpu_bo_placement_from_domain(*bo, (*bo)->allowed_domains);
+        for (i = 0; i < (*bo)->placement.num_placement; i++)
+            (*bo)->placements[i].flags |= TTM_PL_FLAG_CONTIGUOUS;
+        r = ttm_bo_validate(&(*bo)->tbo, &(*bo)->placement, &ctx);
+        if (r)
+            return r;
+    }
        return amdgpu_ttm_alloc_gart(&(*bo)->tbo);
  }







[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux