Re: [RFC] drm/vgem: Don't use get_page in fault handler.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 07.07.20 um 14:05 schrieb Thomas Hellström (Intel):

On 7/7/20 1:07 PM, Chris Wilson wrote:
Quoting Christian König (2020-07-07 11:58:26)
Am 07.07.20 um 10:59 schrieb Daniel Vetter:
On Tue, Jul 7, 2020 at 9:27 AM Lepton Wu <ytht.net@xxxxxxxxx> wrote:
For pages which are allocated in ttm with transparent huge pages,
tail pages have zero as reference count. The current vgem code use
get_page on them and it will trigger BUG when release_pages get called
on those pages later.

Here I attach the minimal code to trigger this bug on a qemu VM which
enables virtio gpu (card1) and vgem (card0). BTW, since the upstream
virtio gpu has changed to use drm gem and moved away from ttm. So we
have to use an old kernel like 5.4 to reproduce it. But I guess
same thing could happen for a real GPU if the real GPU use similar code path to allocate pages from ttm. I am not sure about two things: first, do we need to fix this? will a real GPU hit this code path? Second, suppose this need to be fixed, should this be fixed in ttm side or vgem side?  If we remove "huge_flags &= ~__GFP_COMP" from ttm_get_pages, then get_page in vgem works
fine. But it's there for fix another bug:
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugs.freedesktop.org%2Fshow_bug.cgi%3Fid%3D103138&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7C6c4170559c0a4d0d5f5708d8226e0f97%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637297203404761846&amp;sdata=qXBQBKvBM7zvi6PIkcAPfBjmOC8UrKmjnO03hWKzNHU%3D&amp;reserved=0 It could also be "fixed" with this patch. But I am really not sure if this is
the way to go.
For imported dma-buf objects, vgem should not handle this itself, but
forward to the exporter through the dma_buf_mmap stuff. We have
helpers for this all now, probably just not wired up correctly. Trying
to ensure that all combinations of mmap code across all drivers work
the same doesn't work.
Yes, Daniel is right what vgem does here is fundamentally broken in many
ways.

First of all nobody should ever call get_page() on the pages TTM has
allocated. Those pages come out of a block box and should not be touched
by anybody.
It doesn't. It's only used on the pages vgem allocates (from shmemfs). So
I'm really confused as to how ttm gets involved here.
-Chris

Sounds like vgem is allowing mmap of imported dma-bufs?

Yeah agree, this is most likely the underlying problem and should be fixed instead.

Christian.


/Thomas

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7C6c4170559c0a4d0d5f5708d8226e0f97%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637297203404761846&amp;sdata=Ny4sdibo9o8%2F8UqxdP74HaOXQjoKkmd%2FrRAPHszr3rk%3D&amp;reserved=0

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel




[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux