On Wed, 2024-12-11 at 13:42 -0800, Matthew Brost wrote:
> On Mon, Dec 02, 2024 at 11:44:47AM +0100, Thomas Hellström wrote:
> > On Tue, 2024-10-15 at 20:25 -0700, Matthew Brost wrote:
> > > Add XE_BO_FLAG_SYSTEM_ALLOC to indicate BO is tied to SVM range.
> > >
> > > Add XE_BO_FLAG_SKIP_CLEAR to indicate BO does not need to
> > > cleared.
> > >
> > > v2:
> > >  - Take VM ref for system allocator BOs
> > >
> > > Signed-off-by: Matthew Brost <matthew.brost@xxxxxxxxx>
> > > ---
> > >  drivers/gpu/drm/xe/xe_bo.c | 15 +++++++++------
> > >  drivers/gpu/drm/xe/xe_bo.h |  2 ++
> > >  2 files changed, 11 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> > > index a02d63e322ae..dbd03383878e 100644
> > > --- a/drivers/gpu/drm/xe/xe_bo.c
> > > +++ b/drivers/gpu/drm/xe/xe_bo.c
> > > @@ -685,8 +685,9 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
> > >  	move_lacks_source = !old_mem || (handle_system_ccs ? (!bo->ccs_cleared) :
> > >  					 (!mem_type_is_vram(old_mem_type) && !tt_has_data));
> > >
> > > -	needs_clear = (ttm && ttm->page_flags & TTM_TT_FLAG_ZERO_ALLOC) ||
> > > -		(!ttm && ttm_bo->type == ttm_bo_type_device);
> > > +	needs_clear = !(bo->flags & XE_BO_FLAG_SKIP_CLEAR) &&
> > > +		((ttm && ttm->page_flags & TTM_TT_FLAG_ZERO_ALLOC) ||
> > > +		(!ttm && ttm_bo->type == ttm_bo_type_device));
> >
> > It should be worth adding a note about how clearing for svm bos is
> > intended to work. From what I can tell, there is an option to clear
> > on migration from system to vram if no system pages are present?
> >
>
> Sure can add a comment. The migration from system to vram doesn't do
> a clear currently because when 'check_pages' is set we only migrate
> CPU faulted in pages. If we remove that, then yes we'd need a clear
> on migration.
>
> > >
> > >  	if (new_mem->mem_type == XE_PL_TT) {
> > >  		ret = xe_tt_map_sg(ttm);
> > > @@ -1145,7 +1146,7 @@ static void xe_ttm_bo_destroy(struct ttm_buffer_object *ttm_bo)
> > >  	xe_drm_client_remove_bo(bo);
> > >  #endif
> > >
> > > -	if (bo->vm && xe_bo_is_user(bo))
> > > +	if (bo->vm && (xe_bo_is_user(bo) || bo->flags & XE_BO_FLAG_SYSTEM_ALLOC))
> > >  		xe_vm_put(bo->vm);
> > >
> > >  	mutex_lock(&xe->mem_access.vram_userfault.lock);
> > > @@ -1301,7 +1302,8 @@ struct xe_bo *___xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
> > >  	int err;
> > >
> > >  	/* Only kernel objects should set GT */
> > > -	xe_assert(xe, !tile || type == ttm_bo_type_kernel);
> > > +	xe_assert(xe, !tile || type == ttm_bo_type_kernel ||
> > > +		  flags & XE_BO_FLAG_SYSTEM_ALLOC);
> > >
> > >  	if (XE_WARN_ON(!size)) {
> > >  		xe_bo_free(bo);
> > > @@ -1493,7 +1495,7 @@ __xe_bo_create_locked(struct xe_device *xe,
> > >  	 * by having all the vm's bo refereferences released at vm close
> > >  	 * time.
> > >  	 */
> > > -	if (vm && xe_bo_is_user(bo))
> > > +	if (vm && (xe_bo_is_user(bo) || bo->flags & XE_BO_FLAG_SYSTEM_ALLOC))
> > >  		xe_vm_get(vm);
> > >  	bo->vm = vm;
> > >
> > > @@ -2333,7 +2335,8 @@ bool xe_bo_needs_ccs_pages(struct xe_bo *bo)
> > >  	 * can't be used since there's no CCS storage associated with
> > >  	 * non-VRAM addresses.
> > >  	 */
> > > -	if (IS_DGFX(xe) && (bo->flags & XE_BO_FLAG_SYSTEM))
> > > +	if (IS_DGFX(xe) && ((bo->flags & XE_BO_FLAG_SYSTEM) ||
> > > +			    (bo->flags & XE_BO_FLAG_SYSTEM_ALLOC)))
> > >  		return false;
> >
> > Can we support CCS with system allocator? Perhaps add a TODO
> > comment if so. I figure it should be possible if we resolve on
> > migration to system, which we do on BMG.
> >
>
> Honestly don't really understand how CCS works, so unsure if
> possible.
> Can add a TODO comment and we can circle back.

Sounds good. We should probably also discuss with UMD if they see a
performance use-case.
/Thomas

>
> Matt
>
> >
> > Thanks,
> > Thomas
> >
> >
> > >
> > >  	return true;
> > > diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h
> > > index 7fa44a0138b0..caf0459d16ad 100644
> > > --- a/drivers/gpu/drm/xe/xe_bo.h
> > > +++ b/drivers/gpu/drm/xe/xe_bo.h
> > > @@ -39,6 +39,8 @@
> > >  #define XE_BO_FLAG_NEEDS_64K		BIT(15)
> > >  #define XE_BO_FLAG_NEEDS_2M		BIT(16)
> > >  #define XE_BO_FLAG_GGTT_INVALIDATE	BIT(17)
> > > +#define XE_BO_FLAG_SYSTEM_ALLOC		BIT(18)
> > > +#define XE_BO_FLAG_SKIP_CLEAR		BIT(19)
> > >  /* this one is trigger internally only */
> > >  #define XE_BO_FLAG_INTERNAL_TEST	BIT(30)
> > >  #define XE_BO_FLAG_INTERNAL_64K		BIT(31)
> >
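
For reference, the two predicates discussed in this thread (the needs_clear
computation in xe_bo_move() and the condition guarding xe_vm_get() /
xe_vm_put()) can be modelled in plain userspace C. This is only a rough
sketch, not driver code: the sketch_* names, struct sketch_bo, and the
values of SKETCH_BO_FLAG_USER and SKETCH_TT_FLAG_ZERO_ALLOC are assumptions
made up for illustration. Only the BIT(18) / BIT(19) positions come from the
quoted xe_bo.h hunk.

```c
/* Userspace model of two predicates touched by the patch; not driver code. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Bit positions taken from the quoted xe_bo.h hunk. */
#define XE_BO_FLAG_SYSTEM_ALLOC		BIT(18)
#define XE_BO_FLAG_SKIP_CLEAR		BIT(19)
/* Stand-in values, chosen only for this sketch. */
#define SKETCH_BO_FLAG_USER		BIT(0)
#define SKETCH_TT_FLAG_ZERO_ALLOC	BIT(1)

enum sketch_bo_type { SKETCH_BO_TYPE_DEVICE, SKETCH_BO_TYPE_KERNEL };

struct sketch_bo {
	uint32_t flags;
	enum sketch_bo_type type;
	bool has_tt;		/* models "ttm != NULL" */
	uint32_t tt_page_flags;	/* models ttm->page_flags */
};

/* Mirrors the patched needs_clear expression: SKIP_CLEAR vetoes any clear. */
static bool sketch_needs_clear(const struct sketch_bo *bo)
{
	if (bo->flags & XE_BO_FLAG_SKIP_CLEAR)
		return false;
	if (bo->has_tt)
		return (bo->tt_page_flags & SKETCH_TT_FLAG_ZERO_ALLOC) != 0;
	return bo->type == SKETCH_BO_TYPE_DEVICE;
}

/*
 * Mirrors the condition used on both the get (create) and put (destroy)
 * paths. Sharing one predicate keeps the reference count balanced.
 */
static bool sketch_holds_vm_ref(const struct sketch_bo *bo, bool has_vm)
{
	return has_vm &&
	       (bo->flags & (SKETCH_BO_FLAG_USER | XE_BO_FLAG_SYSTEM_ALLOC)) != 0;
}

/* Small usage example covering the SVM-style case from the patch. */
static void sketch_selftest(void)
{
	struct sketch_bo bo = {
		.flags = 0,
		.type = SKETCH_BO_TYPE_DEVICE,
		.has_tt = false,
	};

	/* A device BO without a TT needs a clear by default. */
	assert(sketch_needs_clear(&bo));

	/* An SVM-backed BO created with SKIP_CLEAR does not. */
	bo.flags |= XE_BO_FLAG_SYSTEM_ALLOC | XE_BO_FLAG_SKIP_CLEAR;
	assert(!sketch_needs_clear(&bo));

	/* It does take a VM reference, but only when a VM is attached. */
	assert(sketch_holds_vm_ref(&bo, true));
	assert(!sketch_holds_vm_ref(&bo, false));
}
```

If check_pages were dropped as mentioned above, the SKIP_CLEAR veto in this
model is the part that would presumably need revisiting, since migration
could then populate VRAM pages that have no CPU-faulted source.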