Quoting Thomas Hellström (Intel) (2020-07-07 13:05:31) > > On 7/7/20 1:07 PM, Chris Wilson wrote: > > Quoting Christian König (2020-07-07 11:58:26) > >> Am 07.07.20 um 10:59 schrieb Daniel Vetter: > >>> On Tue, Jul 7, 2020 at 9:27 AM Lepton Wu <ytht.net@xxxxxxxxx> wrote: > >>>> For pages which are allocated in ttm with transparent huge pages, > >>>> tail pages have zero as reference count. The current vgem code use > >>>> get_page on them and it will trigger BUG when release_pages get called > >>>> on those pages later. > >>>> > >>>> Here I attach the minimal code to trigger this bug on a qemu VM which > >>>> enables virtio gpu (card1) and vgem (card0). BTW, since the upstream > >>>> virtio gpu has changed to use drm gem and moved away from ttm. So we > >>>> have to use an old kernel like 5.4 to reproduce it. But I guess > >>>> same thing could happen for a real GPU if the real GPU use similar code > >>>> path to allocate pages from ttm. I am not sure about two things: first, do we > >>>> need to fix this? will a real GPU hit this code path? Second, suppose this > >>>> need to be fixed, should this be fixed in ttm side or vgem side? If we remove > >>>> "huge_flags &= ~__GFP_COMP" from ttm_get_pages, then get_page in vgem works > >>>> fine. But it's there for fix another bug: > >>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugs.freedesktop.org%2Fshow_bug.cgi%3Fid%3D103138&data=02%7C01%7Cchristian.koenig%40amd.com%7Cfc0831be8fd848abfd8908d82254266d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637297092113216357&sdata=S%2BtLJyB8mqyyE%2BSIjbfHM6FGFuFjrfI50tahpaoJ3wU%3D&reserved=0 > >>>> It could also be "fixed" with this patch. But I am really not sure if this is > >>>> the way to go. > >>> For imported dma-buf objects, vgem should not handle this itself, but > >>> forward to the exporter through the dma_buf_mmap stuff. We have > >>> helpers for this all now, probably just not wired up correctly. Trying > >>> to ensure that all combinations of mmap code across all drivers work > >>> the same doesn't work. > >> Yes, Daniel is right what vgem does here is fundamentally broken in many > >> ways. > >> > >> First of all nobody should ever call get_page() on the pages TTM has > >> allocated. Those pages come out of a block box and should not be touched > >> by anybody. > > It doesn't. It's only used on the pages vgem allocates (from shmemfs). So > > I'm really confused as to how ttm gets involved here. > > -Chris > > Sounds like vgem is allowing mmap of imported dma-bufs? It has vgem_prime_mmap(): if (!obj->filp) return -ENODEV; Ah, vgem_prime_import_sg_table is where the wires get crossed. Oh my, and that even calls __vgem_gem_create() so it ends up with a dummy obj->filp from drm_gem_object_init. -Chris _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel