On Thu, 2024-12-05 at 12:40 +0000, Matthew Auld wrote: > On 05/12/2024 12:02, Nirmoy Das wrote: > > There could be still migration job going on while doing > > xe_tt_unmap_sg() which could trigger GPU page faults. Fix this by > > waiting for the migration job to finish. > > > > v2: Use intr=false(Matt A) > > > > Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3466 > > Fixes: 75521e8b56e8 ("drm/xe: Perform dma_map when moving system > > buffer objects to TT") > > Cc: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx> > > Cc: Matthew Brost <matthew.brost@xxxxxxxxx> > > Cc: Lucas De Marchi <lucas.demarchi@xxxxxxxxx> > > Cc: <stable@xxxxxxxxxxxxxxx> # v6.11+ > > Cc: Matthew Auld <matthew.auld@xxxxxxxxx> > > Signed-off-by: Nirmoy Das <nirmoy.das@xxxxxxxxx> > > Ok, so this is something like ttm_bo_move_to_ghost() doing a pipeline > move for tt -> system, but we then do xe_tt_unmap_sg() too early > which > tears down the IOMMU (if enabled) mappings whilst the job is in > progress? > > Maybe add some more info to the commit message? I think this for sure > fixes it. Just wondering if it's somehow possible to keep the mapping > until the job is done, since all tt -> sys moves are now synced here? > > Unless Thomas has a better idea here, Not at the moment. Ideally we should somehow attach the dma-map to the ghost object. Perhaps moving forward we could look at attaching it to the XE_PL_TT resource. Then I believe it will get freed with the ghost object. /Thomas > Reviewed-by: Matthew Auld <matthew.auld@xxxxxxxxx> > > > --- > > drivers/gpu/drm/xe/xe_bo.c | 10 +++++++++- > > 1 file changed, 9 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/xe/xe_bo.c > > b/drivers/gpu/drm/xe/xe_bo.c > > index b2aa368a23f8..c906a5529db0 100644 > > --- a/drivers/gpu/drm/xe/xe_bo.c > > +++ b/drivers/gpu/drm/xe/xe_bo.c > > @@ -857,8 +857,16 @@ static int xe_bo_move(struct ttm_buffer_object > > *ttm_bo, bool evict, > > > > out: > > if ((!ttm_bo->resource || ttm_bo->resource->mem_type == > > XE_PL_SYSTEM) && > > - ttm_bo->ttm) > > + ttm_bo->ttm) { > > + long timeout = dma_resv_wait_timeout(ttm_bo- > > >base.resv, > > + > > DMA_RESV_USAGE_BOOKKEEP, > > + false, > > + > > MAX_SCHEDULE_TIMEOUT); > > + if (timeout < 0) > > + ret = timeout; > > + > > xe_tt_unmap_sg(ttm_bo->ttm); > > + } > > > > return ret; > > } >