On Mon, Aug 15, 2022 at 12:05:19PM +0200, Christian König wrote: > Am 15.08.22 um 11:54 schrieb Dmitry Osipenko: > > Higher order pages allocated using alloc_pages() aren't refcounted and they > > need to be refcounted, otherwise it's impossible to map them by KVM. This > > patch sets the refcount of the tail pages and fixes the KVM memory mapping > > faults. > > > > Without this change guest virgl driver can't map host buffers into guest > > and can't provide OpenGL 4.5 profile support to the guest. The host > > mappings are also needed for enabling the Venus driver using host GPU > > drivers that are utilizing TTM. > > > > Based on a patch proposed by Trigger Huang. > > Well I can't count how often I have repeated this: This is an absolutely > clear NAK! > > TTM pages are not reference counted in the first place and because of this > giving them to virgl is illegal. > > Please immediately stop this completely broken approach. We have discussed > this multiple times now. Yeah we need to get this stuff closed for real by tagging them all with VM_IO or VM_PFNMAP asap. It seems ot be a recurring amount of fun that people try to mmap dma-buf and then call get_user_pages on them. Which just doesn't work. I guess this is also why Rob Clark send out that dma-buf patch to expos mapping information (i.e. wc vs wb vs uncached). There seems to be some serious bonghits going on :-/ -Daniel > > Regards, > Christian. > > > > > Cc: stable@xxxxxxxxxxxxxxx > > Cc: Trigger Huang <Trigger.Huang@xxxxxxxxx> > > Link: https://www.collabora.com/news-and-blog/blog/2021/11/26/venus-on-qemu-enabling-new-virtual-vulkan-driver/#qcom1343 > > Tested-by: Dmitry Osipenko <dmitry.osipenko@xxxxxxxxxxxxx> # AMDGPU (Qemu and crosvm) > > Signed-off-by: Dmitry Osipenko <dmitry.osipenko@xxxxxxxxxxxxx> > > --- > > drivers/gpu/drm/ttm/ttm_pool.c | 25 ++++++++++++++++++++++++- > > 1 file changed, 24 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c > > index 21b61631f73a..11e92bb149c9 100644 > > --- a/drivers/gpu/drm/ttm/ttm_pool.c > > +++ b/drivers/gpu/drm/ttm/ttm_pool.c > > @@ -81,6 +81,7 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags, > > unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS; > > struct ttm_pool_dma *dma; > > struct page *p; > > + unsigned int i; > > void *vaddr; > > /* Don't set the __GFP_COMP flag for higher order allocations. > > @@ -93,8 +94,10 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags, > > if (!pool->use_dma_alloc) { > > p = alloc_pages(gfp_flags, order); > > - if (p) > > + if (p) { > > p->private = order; > > + goto ref_tail_pages; > > + } > > return p; > > } > > @@ -120,6 +123,23 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags, > > dma->vaddr = (unsigned long)vaddr | order; > > p->private = (unsigned long)dma; > > + > > +ref_tail_pages: > > + /* > > + * KVM requires mapped tail pages to be refcounted because put_page() > > + * is invoked on them in the end of the page fault handling, and thus, > > + * tail pages need to be protected from the premature releasing. > > + * In fact, KVM page fault handler refuses to map tail pages to guest > > + * if they aren't refcounted because hva_to_pfn_remapped() checks the > > + * refcount specifically for this case. > > + * > > + * In particular, unreferenced tail pages result in a KVM "Bad address" > > + * failure for VMMs that use VirtIO-GPU when guest's Mesa VirGL driver > > + * accesses mapped host TTM buffer that contains tail pages. > > + */ > > + for (i = 1; i < 1 << order; i++) > > + page_ref_inc(p + i); > > + > > return p; > > error_free: > > @@ -133,6 +153,7 @@ static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching, > > { > > unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS; > > struct ttm_pool_dma *dma; > > + unsigned int i; > > void *vaddr; > > #ifdef CONFIG_X86 > > @@ -142,6 +163,8 @@ static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching, > > if (caching != ttm_cached && !PageHighMem(p)) > > set_pages_wb(p, 1 << order); > > #endif > > + for (i = 1; i < 1 << order; i++) > > + page_ref_dec(p + i); > > if (!pool || !pool->use_dma_alloc) { > > __free_pages(p, order); > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch