> -----Original Message----- > From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Matthew Auld > Sent: Thursday, January 20, 2022 11:05 PM > To: Paneer Selvam, Arunpravin <Arunpravin.PaneerSelvam@xxxxxxx>; dri-devel@xxxxxxxxxxxxxxxxxxxxx; intel-gfx@xxxxxxxxxxxxxxxxxxxxx; amd-gfx@xxxxxxxxxxxxxxxxxxxxx > Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; tzimmermann@xxxxxxx; jani.nikula@xxxxxxxxxxxxxxx; Koenig, Christian <Christian.Koenig@xxxxxxx>; daniel@xxxxxxxx > Subject: Re: [PATCH v9 4/6] drm: implement a method to free unused pages > > On 19/01/2022 11:37, Arunpravin wrote: >> On contiguous allocation, we round up the size to the *next* power of >> 2, implement a function to free the unused pages after the newly >> allocate block. >> >> v2(Matthew Auld): >> - replace function name 'drm_buddy_free_unused_pages' with >> drm_buddy_block_trim >> - replace input argument name 'actual_size' with 'new_size' >> - add more validation checks for input arguments >> - add overlaps check to avoid needless searching and splitting >> - merged the below patch to see the feature in action >> - add free unused pages support to i915 driver >> - lock drm_buddy_block_trim() function as it calls mark_free/mark_split >> are all globally visible >> >> v3(Matthew Auld): >> - remove trim method error handling as we address the failure case >> at drm_buddy_block_trim() function >> >> v4: >> - in case of trim, at __alloc_range() split_block failure path >> marks the block as free and removes it from the original list, >> potentially also freeing it, to overcome this problem, we turn >> the drm_buddy_block_trim() input node into a temporary node to >> prevent recursively freeing itself, but still retain the >> un-splitting/freeing of the other nodes(Matthew Auld) >> >> - modify the drm_buddy_block_trim() function return type >> >> v5(Matthew Auld): >> - revert drm_buddy_block_trim() function return type changes in v4 >> - modify drm_buddy_block_trim() passing argument n_pages to original_size >> as n_pages has already been rounded up to the next power-of-two and >> passing n_pages results noop >> >> v6: >> - fix warnings reported by kernel test robot <lkp@xxxxxxxxx> >> >> Signed-off-by: Arunpravin <Arunpravin.PaneerSelvam@xxxxxxx> >> --- >> drivers/gpu/drm/drm_buddy.c | 65 +++++++++++++++++++ >> drivers/gpu/drm/i915/i915_ttm_buddy_manager.c | 10 +++ >> include/drm/drm_buddy.h | 4 ++ >> 3 files changed, 79 insertions(+) >> >> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c >> index 6aa5c1ce25bf..c5902a81b8c5 100644 >> --- a/drivers/gpu/drm/drm_buddy.c >> +++ b/drivers/gpu/drm/drm_buddy.c >> @@ -546,6 +546,71 @@ static int __drm_buddy_alloc_range(struct drm_buddy *mm, >> return __alloc_range(mm, &dfs, start, size, blocks); >> } >> >> +/** >> + * drm_buddy_block_trim - free unused pages >> + * >> + * @mm: DRM buddy manager >> + * @new_size: original size requested >> + * @blocks: output list head to add allocated blocks > > @blocks: Input and output list of allocated blocks. MUST contain single block as input to be trimmed. On success will contain the newly allocated blocks making up the @new_size. Blocks always appear in ascending order. > > ? > >> + * >> + * For contiguous allocation, we round up the size to the nearest >> + * power of two value, drivers consume *actual* size, so remaining >> + * portions are unused and it can be freed. > > so remaining portions are unused and can be optionally freed with this function. > > ? > >> + * >> + * Returns: >> + * 0 on success, error code on failure. >> + */ >> +int drm_buddy_block_trim(struct drm_buddy *mm, >> + u64 new_size, >> + struct list_head *blocks) >> +{ >> + struct drm_buddy_block *parent; >> + struct drm_buddy_block *block; >> + LIST_HEAD(dfs); >> + u64 new_start; >> + int err; >> + >> + if (!list_is_singular(blocks)) >> + return -EINVAL; >> + >> + block = list_first_entry(blocks, >> + struct drm_buddy_block, >> + link); >> + >> + if (!drm_buddy_block_is_allocated(block)) > > Maybe: > > if (WARN_ON(!drm_buddy_block_is_allocated())) > > AFAIK it should be normally impossible to be handed such non-allocated block, and so should be treated as a serious programmer error. > > ? > >> + return -EINVAL; >> + >> + if (new_size > drm_buddy_block_size(mm, block)) >> + return -EINVAL; >> + >> + if (!new_size && !IS_ALIGNED(new_size, mm->chunk_size)) >> + return -EINVAL; > > I assume that's a typo: > > if (!new_size || ...) > > Otherwise I think looks good. Some unit tests for this would be nice, but not a blocker. And this does at least pass the igt_mock_contiguous selftest, and I didn't see anything nasty when running on DG1, which does make use of TTM_PL_FLAG_CONTIGUOUS, Good to hear its running on DG1, all changes are added to v10. working on moving i915 buddy selftests into drm selftest folder, I will add a new unit test case for trim function > Reviewed-by: Matthew Auld <matthew.auld@xxxxxxxxx> > >> + >> + if (new_size == drm_buddy_block_size(mm, block)) >> + return 0; >> + >> + list_del(&block->link); >> + mark_free(mm, block); >> + mm->avail += drm_buddy_block_size(mm, block); >> + >> + /* Prevent recursively freeing this node */ >> + parent = block->parent; >> + block->parent = NULL; >> + >> + new_start = drm_buddy_block_offset(block); >> + list_add(&block->tmp_link, &dfs); >> + err = __alloc_range(mm, &dfs, new_start, new_size, blocks); >> + if (err) { >> + mark_allocated(block); >> + mm->avail -= drm_buddy_block_size(mm, block); >> + list_add(&block->link, blocks); >> + } >> + >> + block->parent = parent; >> + return err; >> +} >> +EXPORT_SYMBOL(drm_buddy_block_trim); >> + >> /** >> * drm_buddy_alloc_blocks - allocate power-of-two blocks >> * >> diff --git a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c >> b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c >> index 3662434b64bb..53eb100688a6 100644 >> --- a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c >> +++ b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c >> @@ -97,6 +97,16 @@ static int i915_ttm_buddy_man_alloc(struct ttm_resource_manager *man, >> if (unlikely(err)) >> goto err_free_blocks; >> >> + if (place->flags & TTM_PL_FLAG_CONTIGUOUS) { >> + u64 original_size = (u64)bman_res->base.num_pages << PAGE_SHIFT; >> + >> + mutex_lock(&bman->lock); >> + drm_buddy_block_trim(mm, >> + original_size, >> + &bman_res->blocks); >> + mutex_unlock(&bman->lock); >> + } >> + >> *res = &bman_res->base; >> return 0; >> >> diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h index >> 424fc443115e..17ca928fce8e 100644 >> --- a/include/drm/drm_buddy.h >> +++ b/include/drm/drm_buddy.h >> @@ -145,6 +145,10 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm, >> struct list_head *blocks, >> unsigned long flags); >> >> +int drm_buddy_block_trim(struct drm_buddy *mm, >> + u64 new_size, >> + struct list_head *blocks); >> + >> void drm_buddy_free_block(struct drm_buddy *mm, struct >> drm_buddy_block *block); >> >> void drm_buddy_free_list(struct drm_buddy *mm, struct list_head >> *objects); >>