On 19.10.22 12:43, Oleksii Moisieiev wrote:
Greetings. I need your advise about the problem I'm facing right now: I'm working on the new type of dmabuf export implementation. This is going to be new ioctl to the gntdev and it's purpose is to use external buffer, provided by file descriptor as the backing storage during export to grant refs. Few words about the functionality I'm working on right now: My setup is based on IMX8QM (please see PS below if you need configuration details) When using dma-buf exporter to create dma-buf with backing storage and map it to the grant refs, provided from the domain, we've met a problem, that several HW (i.MX8 gpu in our case) do not support external buffer and requires backing storage to be created using it's native tools (eglCreateImageKHR returns EGL_NO_IMAGE_KHR for buffers, which were not created using gbm_bo_create). That's why new ioctls were added to be able to pass existing dma-buffer fd as input parameter and use it as backing storage to export to refs. Kernel version on IMX8QM board is 5.10.72 and itworks fine on this kernel version. New ioctls source code can be found here: https://github.com/oleksiimoisieiev/linux/tree/gntdev_map_buf_upstr_for-linus-6.1_2Now regarding the problem I've met when rebased those code on master:On my test stand I use DRM_IOCTL_MODE_CREATE_DUMB/DRM_IOCTL_MODE_DESTROY_DUMB ioctls to allocate buffer and I'm observing the following backtrace on DRM_IOCTL_MODE_DESTROY_DUMB: Unable to handle kernel paging request at virtual address 0000000387000098 Mem abort info: ESR = 0x0000000096000005 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x05: level 1 translation fault Data abort info: ISV = 0, ISS = 0x00000005 CM = 0, WnR = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=000000006df98000 [0000000387000098] pgd=0800000064f4f003, p4d=0800000064f4f003, pud=0000000000000000 Internal error: Oops: 96000005 [#1] PREEMPT SMP Modules linked in: xen_pciback overlay crct10dif_ce ip_tables x_tables ipv6 PU: 0 PID: 34 Comm: kworker/0:1 Not tainted 6.0.0 #85 Hardware name: linux,dummy-virt (DT) Workqueue: events virtio_gpu_dequeue_ctrl_func pstate: 000000c5 (nzcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : check_move_unevictable_folios+0xb8/0x4d0 lr : check_move_unevictable_folios+0xb4/0x4d0 sp : ffff8000081a3ad0 x29: ffff8000081a3ad0 x28: ffff03856ac98800 x27: 0000000000000000 x26: ffffde7b168ee9d8 x25: ffff03856ae26008 x24: 0000000000000000 x23: ffffde7b1758d6c0 x22: 0000000000000001 x21: ffff8000081a3b68 x20: 0000000000000001 x19: fffffc0e15935040 x18: ffffffffffffffff x17: ffff250a68e3d000 x16: 0000000000000012 x15: ffff8000881a38d7 x14: 0000000000000000 x13: ffffde7b175a3150 x12: 0000000000002c55 x11: 0000000000000ec7 x10: ffffde7b176113f8 x9 : ffffde7b175a3150 x8 : 0000000100004ec7 x7 : ffffde7b175fb150 x6 : ffff8000081a3b70 x5 : 0000000000000001 x4 : 0000000000000000 x3 : ffff03856ac98850 x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000387000000 Call trace: check_move_unevictable_folios+0xb8/0x4d0 check_move_unevictable_pages+0x8c/0x110 drm_gem_put_pages+0x118/0x198 drm_gem_shmem_put_pages_locked+0x4c/0x70 drm_gem_shmem_unpin+0x30/0x50 virtio_gpu_cleanup_object+0x84/0x130 virtio_gpu_cmd_unref_cb+0x18/0x2c virtio_gpu_dequeue_ctrl_func+0x124/0x290 process_one_work+0x1d0/0x320 worker_thread+0x14c/0x444 kthread+0x10c/0x110 ret_from_fork+0x10/0x20 Code: 97fc3fe1 aa1303e0 94003ac7 b4000080 (f9404c00) ---[ end trace 0000000000000000 ]--- After some investigation I think I've found the cause of the problem: This is the functionality, added in commit 3f9f1c67572f5e5e6dc84216d48d1480f3c4fcf6 which introduces safe mechanism to unmap grant pages which is waiting until page_count(page) = 1 before doing unmap. The problem is that DRM_IOCTL_MODE_CREATE_DUMB creates buffer where page_count(page) = 2. On my QEMU test stand I'm using Xen 4.16 (aarch64) with debian based Dom0 + DomU on the latest kernels. I've created some apps for testing: The first one is to allocate grant refs on DomU: https://github.com/oleksiimoisieiev/linux/tree/gntdev_map_buf_upstr_for-linus-6.1_2 The name is test.ko and it can be built using command: cd ./test; make NOTE: makefile expects kernel build to be present on ../../test-build directory. It should be run on DomU using command: insmod test.ko; echo "create" > /sys/kernel/etx_sysfs/etx_value Result will be the following: [ 126.104903] test: loading out-of-tree module taints kernel. [ 126.150586] Sysfs - Write!!! [ 126.150773] create [ 126.150773] [ 126.150888] Hello, World! [ 126.151203] Hello, World! [ 126.151324] gref 301 [ 126.151376] addr ffff00000883d000 [ 126.151431] gref 302 [ 126.151454] addr ffff00000883e000 [ 126.151478] gref 303 [ 126.151497] addr ffff00000883f000 [ 126.151525] gref 304 [ 126.151546] addr ffff000008840000 [ 126.151573] gref 305 [ 126.151593] addr ffff000008841000 The second is for dom0 and can be found here: https://github.com/oleksiimoisieiev/xen/tree/gntdev_fd How to build: make -C tools/console all Result: ./tools/console/gnt_test should be uploaded to Dom0 Start: sudo ./gnt_test_map 1 301 302 303 304 305 Where 1 is DomU ID and 301 302 303 304 305 - grefs from test.ko output This will create buffer using ioctls DRM_IOCTL_MODE_CREATE_DUMB them passes it as backing storage to gntdev and then destroys it using DRM_IOCTL_MODE_DESTROY_DUMB. The problem is that when dumb buffer is created we observe page_count(page) = 2. So when before buffer release I'm trying to unmap grant refs using unmap_grant_pages it is calling __gnttab_unmap_refs_async, which postpones actual unmapping to 5 ms because page_count(page) > 1. Which causes drm_gem_get_pages to try to free pages, which are still mapped. Also if I change in the following line: https://github.com/torvalds/linux/blob/bb1a1146467ad812bb65440696df0782e2bc63c8/drivers/xen/grant-table.c#L1313 change from page_count(item->pages[pc]) > 1 to page_count(item->pages[pc]) > 2 - everything works fine. The obvious way for fix this issue I see is to make the expected page_count for __gnttab_unmap_refs_async configurable for each buffer, but I'm now sure if this is the best solution. I would be happy to hear your thoughts and advises about how to fix this situation.
My first thought would be to save the page_count() of each page when doing the map operation, and then compare to that value. The natural place to store this count would be struct xen_page_foreign, but there are only 16 bits free for the 64-bit system case (it is using the struct page->private field for that purpose), so you'd need to bail out in case page_count() is > 65535. Juergen
Attachment:
OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key
Attachment:
OpenPGP_signature
Description: OpenPGP digital signature