On Tue, 2024-12-10 at 17:15 +0100, Nirmoy Das wrote:
> Fix a potential GPU page fault during tt -> system moves by waiting for
> migration jobs to complete before unmapping SG. This ensures that IOMMU
> mappings are not prematurely torn down while a migration job is still in
> progress.
>
> v2: Use intr=false(Matt A)
> v3: Update commit message(Matt A)
>
> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3466
> Fixes: 75521e8b56e8 ("drm/xe: Perform dma_map when moving system buffer objects to TT")
> Cc: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>
> Cc: Matthew Brost <matthew.brost@xxxxxxxxx>
> Cc: Lucas De Marchi <lucas.demarchi@xxxxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx> # v6.11+
> Cc: Matthew Auld <matthew.auld@xxxxxxxxx>
> Signed-off-by: Nirmoy Das <nirmoy.das@xxxxxxxxx>
> Reviewed-by: Matthew Auld <matthew.auld@xxxxxxxxx>
> ---
>  drivers/gpu/drm/xe/xe_bo.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> index 06931df876ab..0a41b6c0583a 100644
> --- a/drivers/gpu/drm/xe/xe_bo.c
> +++ b/drivers/gpu/drm/xe/xe_bo.c
> @@ -857,8 +857,16 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
>
>  out:
>  	if ((!ttm_bo->resource || ttm_bo->resource->mem_type == XE_PL_SYSTEM) &&
> -	    ttm_bo->ttm)
> +	    ttm_bo->ttm) {
> +		long timeout = dma_resv_wait_timeout(ttm_bo->base.resv,
> +						     DMA_RESV_USAGE_BOOKKEEP,
> +						     false,
> +						     MAX_SCHEDULE_TIMEOUT);
> +		if (timeout < 0)
> +			ret = timeout;
> +
>  		xe_tt_unmap_sg(ttm_bo->ttm);
> +	}
>
>  	return ret;
>  }

I assume here we're waiting for the move fence, right? However if
@evict is true, we should hit the ttm_bo_wait_free_node() path. In what
cases do we hit this without evict being true?

Also, shouldn't it be sufficient to wait for DMA_RESV_USAGE_KERNEL
here?

Thanks,
Thomas