Re: [PATCH] drm/buddy: Add start address support to trim function

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On 22/07/2024 12:41, Paneer Selvam, Arunpravin wrote:
Hi Matthew,

On 7/19/2024 4:01 PM, Matthew Auld wrote:
On 17/07/2024 16:02, Paneer Selvam, Arunpravin wrote:


On 7/16/2024 3:34 PM, Matthew Auld wrote:
On 16/07/2024 10:50, Paneer Selvam, Arunpravin wrote:
Hi Matthew,

On 7/10/2024 6:20 PM, Matthew Auld wrote:
On 10/07/2024 07:03, Paneer Selvam, Arunpravin wrote:
Thanks Alex.

Hi Matthew,
Any comments?

Do we not pass the required address alignment when allocating the pages in the first place?
If address alignment is really useful, we can add that in the drm_buddy_alloc_blocks() function.

I mean don't we already pass the min page size, which should give us matching physical address alignment?
I think we don't need to align the address to the passed min_block_size value for all the contiguous buffers, so I thought that decision we can leave it to the drivers and they can achieve that through trim function
in this kind of a specific request.

I would have assumed it would be simpler to use min_block_size and then trim the size, if it's too big? That would then also take care of the try_harder case?
For example, if the required contiguous size is 1MiB and min_block_size is 256KiB, to perform the address alignment of 256KiB, we might need to over-allocate at least to the min_block_size (say 256KiB). Now the size becomes 1280KiB and

If we have 1M contig request then it should already be aligned to 256K and every other power-of-two < 1M. VRAM should start at offset zero, so 1M block will have 1M address alignment, and so should also be aligned to 256K, right?

Or does "address alignment of 256KiB" mean something else here? To me it just means IS_ALIGNED(block_start, 256K).

since the contiguous flag is enabled, we will round up the size to the next power of two and the size value becomes 2MiB. Next, in trimming we should round up the block start address to the min_block_size. May be we can keep the above mentioned operations under the flag combination DRM_BUDDY_CONTIGUOUS_ALLOCATION && DRM_BUDDY_ADDRESS_ALIGNMENT?.

At the moment, we cannot support address alignment for try_harder allocations since in case of try_harder allocations we first traverse RHS to allocate the maximum possible and traverse LHS (here we align the LHS size to min_block_size) to allocate the remaining size. May be in case of DRM_BUDDY_ADDRESS_ALIGNMENT, we should first allocate LHS satisfying the address alignment requirement and then traverse RHS to allocate the remaining size if required?

Also how are we dealing with the multi-block try_harder case? AFAICT we only allow trimming single block atm, or is it not possible to trigger that path here? Or are we handling that somehow?
not possible to trigger that path here. only when we either over-allocate the LHS size and pass the multiple blocks to the trim function or implement the above mentioned method.


https://patchwork.freedesktop.org/series/136150/
We are getting this sparse error from the Intel CI. Do you think these errors are introduced with this patches?

I think it's safe to ignore, there seem to be other series with the same thing.
Thanks.


Thanks,
Arun.


Thanks,
Arun.


Thanks,
Arun.

On 7/9/2024 1:42 AM, Alex Deucher wrote:
On Thu, Jul 4, 2024 at 4:40 AM Arunpravin Paneer Selvam
<Arunpravin.PaneerSelvam@xxxxxxx> wrote:
- Add a new start parameter in trim function to specify exact
   address from where to start the trimming. This would help us
   in situations like if drivers would like to do address alignment
   for specific requirements.

- Add a new flag DRM_BUDDY_TRIM_DISABLE. Drivers can use this
   flag to disable the allocator trimming part. This patch enables
   the drivers control trimming and they can do it themselves
   based on the application requirements.

v1:(Matthew)
   - check new_start alignment with min chunk_size
   - use range_overflows()

Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@xxxxxxx>
Series is:
Acked-by: Alex Deucher <alexander.deucher@xxxxxxx>

I'd like to take this series through the amdgpu tree if there are no objections as it's required for display buffers on some chips and I'd
like to make sure it lands in 6.11.

Thanks,

Alex

---
  drivers/gpu/drm/drm_buddy.c          | 25 +++++++++++++++++++++++--
  drivers/gpu/drm/xe/xe_ttm_vram_mgr.c |  2 +-
  include/drm/drm_buddy.h              |  2 ++
  3 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 94f8c34fc293..8cebe1fa4e9d 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -851,6 +851,7 @@ static int __alloc_contig_try_harder(struct drm_buddy *mm,
   * drm_buddy_block_trim - free unused pages
   *
   * @mm: DRM buddy manager
+ * @start: start address to begin the trimming.
   * @new_size: original size requested
   * @blocks: Input and output list of allocated blocks.
   * MUST contain single block as input to be trimmed.
@@ -866,11 +867,13 @@ static int __alloc_contig_try_harder(struct drm_buddy *mm,
   * 0 on success, error code on failure.
   */
  int drm_buddy_block_trim(struct drm_buddy *mm,
+                        u64 *start,
                          u64 new_size,
                          struct list_head *blocks)
  {
         struct drm_buddy_block *parent;
         struct drm_buddy_block *block;
+       u64 block_start, block_end;
         LIST_HEAD(dfs);
         u64 new_start;
         int err;
@@ -882,6 +885,9 @@ int drm_buddy_block_trim(struct drm_buddy *mm,
                                  struct drm_buddy_block,
                                  link);

+       block_start = drm_buddy_block_offset(block);
+       block_end = block_start + drm_buddy_block_size(mm, block);
+
         if (WARN_ON(!drm_buddy_block_is_allocated(block)))
                 return -EINVAL;

@@ -894,6 +900,20 @@ int drm_buddy_block_trim(struct drm_buddy *mm,
         if (new_size == drm_buddy_block_size(mm, block))
                 return 0;

+       new_start = block_start;
+       if (start) {
+               new_start = *start;
+
+               if (new_start < block_start)
+                       return -EINVAL;
+
+               if (!IS_ALIGNED(new_start, mm->chunk_size))
+                       return -EINVAL;
+
+               if (range_overflows(new_start, new_size, block_end))
+                       return -EINVAL;
+       }
+
         list_del(&block->link);
         mark_free(mm, block);
         mm->avail += drm_buddy_block_size(mm, block);
@@ -904,7 +924,6 @@ int drm_buddy_block_trim(struct drm_buddy *mm,
         parent = block->parent;
         block->parent = NULL;

-       new_start = drm_buddy_block_offset(block);
         list_add(&block->tmp_link, &dfs);
         err =  __alloc_range(mm, &dfs, new_start, new_size, blocks, NULL);
         if (err) {
@@ -1066,7 +1085,8 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
         } while (1);

         /* Trim the allocated block to the required size */
-       if (original_size != size) {
+       if (!(flags & DRM_BUDDY_TRIM_DISABLE) &&
+           original_size != size) {
                 struct list_head *trim_list;
                 LIST_HEAD(temp);
                 u64 trim_size;
@@ -1083,6 +1103,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
                 }

                 drm_buddy_block_trim(mm,
+                                    NULL,
                                      trim_size,
                                      trim_list);

diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
index fe3779fdba2c..423b261ea743 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
@@ -150,7 +150,7 @@ static int xe_ttm_vram_mgr_new(struct ttm_resource_manager *man,
         } while (remaining_size);

         if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
-               if (!drm_buddy_block_trim(mm, vres->base.size, &vres->blocks)) +               if (!drm_buddy_block_trim(mm, NULL, vres->base.size, &vres->blocks))
                         size = vres->base.size;
         }

diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h
index 82570f77e817..0c2f735f0265 100644
--- a/include/drm/drm_buddy.h
+++ b/include/drm/drm_buddy.h
@@ -27,6 +27,7 @@
  #define DRM_BUDDY_CONTIGUOUS_ALLOCATION BIT(2)
  #define DRM_BUDDY_CLEAR_ALLOCATION BIT(3)
  #define DRM_BUDDY_CLEARED BIT(4)
+#define DRM_BUDDY_TRIM_DISABLE BIT(5)

  struct drm_buddy_block {
  #define DRM_BUDDY_HEADER_OFFSET GENMASK_ULL(63, 12)
@@ -155,6 +156,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
                            unsigned long flags);

  int drm_buddy_block_trim(struct drm_buddy *mm,
+                        u64 *start,
                          u64 new_size,
                          struct list_head *blocks);

--
2.25.1








[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux