On 2024-04-22 10:40, Christian König
wrote:
Am 22.04.24 um 15:57 schrieb Philip Yang:
Define macro MAX_SG_SEGMENT_SIZE 2GB, because struct scatterlist length
is unsigned int, and some users of it cast to a signed int, so every
segment of sg table is limited to size 2GB maximum.
For contiguous VRAM allocation, don't limit the max buddy block size in
order to get contiguous VRAM memory. To workaround the sg table segment
size limit, allocate multiple segments if contiguous size is bigger than
MAX_SG_SEGMENT_SIZE.
Signed-off-by: Philip Yang <Philip.Yang@xxxxxxx>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 4be8b091099a..9fe56a21ef88 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -31,6 +31,8 @@
#include "amdgpu_atomfirmware.h"
#include "atom.h"
+#define MAX_SG_SEGMENT_SIZE (2UL << 30)
+
struct amdgpu_vram_reservation {
u64 start;
u64 size;
@@ -532,8 +534,13 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man,
BUG_ON(min_block_size < mm->chunk_size);
- /* Limit maximum size to 2GiB due to SG table limitations */
- size = min(remaining_size, 2ULL << 30);
+ if (place->flags & TTM_PL_FLAG_CONTIGUOUS)
+ size = remaining_size;
+ else
+ /* Limit maximum size to 2GiB due to SG table limitations
+ * for no contiguous allocation.
+ */
+ size = min(remaining_size, MAX_SG_SEGMENT_SIZE);
Well that doesn't make sense, either fix the creation of the sg tables or limit the segment size. Not both.
yes, right. we don't need limit the segment size for non-contiguous allocation either as this is handled by min_block_size. I will send v4 patch to fix this. Then we could have another patch to remove the while loop, size and remaining size to simply the code in future.
Regards,
Philip
if ((size >= (u64)pages_per_block << PAGE_SHIFT) &&
!(size & (((u64)pages_per_block << PAGE_SHIFT) - 1)))
@@ -675,7 +682,7 @@ int amdgpu_vram_mgr_alloc_sgt(struct amdgpu_device *adev,
amdgpu_res_first(res, offset, length, &cursor);
while (cursor.remaining) {
num_entries++;
- amdgpu_res_next(&cursor, cursor.size);
+ amdgpu_res_next(&cursor, min(cursor.size, MAX_SG_SEGMENT_SIZE));
}
r = sg_alloc_table(*sgt, num_entries, GFP_KERNEL);
@@ -695,7 +702,7 @@ int amdgpu_vram_mgr_alloc_sgt(struct amdgpu_device *adev,
amdgpu_res_first(res, offset, length, &cursor);
for_each_sgtable_sg((*sgt), sg, i) {
phys_addr_t phys = cursor.start + adev->gmc.aper_base;
- size_t size = cursor.size;
+ unsigned long size = min(cursor.size, MAX_SG_SEGMENT_SIZE);
Please keep size_t here or use unsigned int, using unsigned long just looks like trying to hide the problem.
And I wouldn't use a separate define but rather just INT_MAX instead.
Regards,
Christian.
dma_addr_t addr;
addr = dma_map_resource(dev, phys, size, dir,
@@ -708,7 +715,7 @@ int amdgpu_vram_mgr_alloc_sgt(struct amdgpu_device *adev,
sg_dma_address(sg) = addr;
sg_dma_len(sg) = size;
- amdgpu_res_next(&cursor, cursor.size);
+ amdgpu_res_next(&cursor, size);
}
return 0;