On Tue, 14 Sept 2021 at 10:03, Christian König <christian.koenig@xxxxxxx> wrote: > > Am 14.09.21 um 10:50 schrieb Matthew Auld: > > Add new flag to indicate special shmem based tt, which can directly > > handle swapping itself, and should be visible to some shrinker. > > > > As part of this we should skip the ttm_pages_allocated accounting, since > > such tt objects should already be reachable, and potentially reclaimable > > by some shrinker, if under memory pressure, and so shouldn't directly > > count towards the swap "watermark" level. > > > > We also need to stop touching the page->mapping and page->index for such > > objects, like in ttm_tt_add_mapping, since shmem already uses these. > > Some drivers seems to depend on the tt mapping/index behaviour for their > > own purposes, so directly using shmem tt likely won't be usable there > > as-is. > > > > Signed-off-by: Matthew Auld <matthew.auld@xxxxxxxxx> > > Cc: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx> > > Cc: Christian König <christian.koenig@xxxxxxx> > > --- > > drivers/gpu/drm/ttm/ttm_bo_vm.c | 4 ++-- > > drivers/gpu/drm/ttm/ttm_tt.c | 10 +++++----- > > include/drm/ttm/ttm_tt.h | 1 + > > 3 files changed, 8 insertions(+), 7 deletions(-) > > > > diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c > > index f56be5bc0861..e2131c73dcb6 100644 > > --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c > > +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c > > @@ -346,8 +346,8 @@ vm_fault_t ttm_bo_vm_fault_reserved(struct vm_fault *vmf, > > } else if (unlikely(!page)) { > > break; > > } > > - page->index = drm_vma_node_start(&bo->base.vma_node) + > > - page_offset; > > + if (!(bo->ttm->page_flags & TTM_PAGE_FLAG_SHMEM)) > > + page->index = drm_vma_node_start(&bo->base.vma_node) + page_offset; > > I still have a rather bad feeling about that. > > This should either not be necessary any more in general or the shmemfile > approach doesn't work correctly. > > Please send a patch to remove this for everybody instead and we will see > if that really works or not. > > > pfn = page_to_pfn(page); > > } > > > > diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c > > index dae52433beeb..cc4815c1f505 100644 > > --- a/drivers/gpu/drm/ttm/ttm_tt.c > > +++ b/drivers/gpu/drm/ttm/ttm_tt.c > > @@ -293,7 +293,7 @@ static void ttm_tt_add_mapping(struct ttm_device *bdev, struct ttm_tt *ttm) > > { > > pgoff_t i; > > > > - if (ttm->page_flags & TTM_PAGE_FLAG_SG) > > + if (ttm->page_flags & (TTM_PAGE_FLAG_SG | TTM_PAGE_FLAG_SHMEM)) > > Maybe you should re-use the TTM_PAGE_FLAG_SG for this and/or rename the > flag to better describe what it does. > > Something like TTM_PAGE_FLAG_EXTERNAL or similar? The only other use > case for TTM_PAGE_FLAG_SG which comes to my mind is controlling if the > pages array is allocated or not. This seems slightly tricky. We still want ttm_bo_vm_reserve() to behave normally when seeing shmem_tt, and yet it still needs to return SIGBUS or so for FLAG_SG, as per the existing behaviour. Throwing in bo->type == type_sg seems maybe plausible, but at least amdgpu is manually setting FLAG_SG for userptr objects, so I presume bo->type != type_sg here? Otherwise maybe just s/SHMEM/EXTERNAL and leave FLAG_SG as-is? > > Christian. > > > return; > > > > for (i = 0; i < ttm->num_pages; ++i) > > @@ -311,7 +311,7 @@ int ttm_tt_populate(struct ttm_device *bdev, > > if (ttm_tt_is_populated(ttm)) > > return 0; > > > > - if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) { > > + if (!(ttm->page_flags & (TTM_PAGE_FLAG_SG | TTM_PAGE_FLAG_SHMEM))) { > > atomic_long_add(ttm->num_pages, &ttm_pages_allocated); > > if (bdev->pool.use_dma32) > > atomic_long_add(ttm->num_pages, > > @@ -349,7 +349,7 @@ int ttm_tt_populate(struct ttm_device *bdev, > > return 0; > > > > error: > > - if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) { > > + if (!(ttm->page_flags & (TTM_PAGE_FLAG_SG | TTM_PAGE_FLAG_SHMEM))) { > > atomic_long_sub(ttm->num_pages, &ttm_pages_allocated); > > if (bdev->pool.use_dma32) > > atomic_long_sub(ttm->num_pages, > > @@ -364,7 +364,7 @@ static void ttm_tt_clear_mapping(struct ttm_tt *ttm) > > pgoff_t i; > > struct page **page = ttm->pages; > > > > - if (ttm->page_flags & TTM_PAGE_FLAG_SG) > > + if (ttm->page_flags & (TTM_PAGE_FLAG_SG | TTM_PAGE_FLAG_SHMEM)) > > return; > > > > for (i = 0; i < ttm->num_pages; ++i) { > > @@ -384,7 +384,7 @@ void ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm) > > else > > ttm_pool_free(&bdev->pool, ttm); > > > > - if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) { > > + if (!(ttm->page_flags & (TTM_PAGE_FLAG_SG | TTM_PAGE_FLAG_SHMEM))) { > > atomic_long_sub(ttm->num_pages, &ttm_pages_allocated); > > if (bdev->pool.use_dma32) > > atomic_long_sub(ttm->num_pages, > > diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h > > index 89b15d673b22..20d550185065 100644 > > --- a/include/drm/ttm/ttm_tt.h > > +++ b/include/drm/ttm/ttm_tt.h > > @@ -42,6 +42,7 @@ struct ttm_operation_ctx; > > #define TTM_PAGE_FLAG_ZERO_ALLOC (1 << 6) > > #define TTM_PAGE_FLAG_SG (1 << 8) > > #define TTM_PAGE_FLAG_NO_RETRY (1 << 9) > > +#define TTM_PAGE_FLAG_SHMEM (1 << 10) > > > > #define TTM_PAGE_FLAG_PRIV_POPULATED (1 << 31) > > >