From: Jan Kara <jack@xxxxxxx>
> I'm a bit confused. Above you write that:
>
> "The memory allocation workflow begins in the userspace, which creates a new
> file backed by 2MiB hugepages with memfd_create(MFD_HUGETLB, MFD_HUGE_2MB)
> and fallocate(). Then the userspace makes an IOCTL to the kernel module
> with the file descriptor and size so that the kernel module can get the
> struct page with find_get_page()."
>
> So the memory allocation actually does happen from fallocate(2) as far as I
> can tell. What guys are suggesting is that instead of passing the prepared
> 'fd' to ioctl(2), your application should mmap the file and pass the
> address of the mmapped area. That's how things are usually done and it also
> gives userspace more freedom over how it prepares buffers for DMA. Also then
> pin_user_pages() comes as a natural API to use in the driver.
>

I failed to explain that the kernel module might call vfs_fallocate() to
allocate the hugepages, then find_get_page() and finally dma_map_single(),
all before userspace maps the file. Sorry for the confusion.

> Now I'm not sure whether changing the ioctl(2) is still an option for you.
> If not, then you have to resort to some kind of workaround as you
> mentioned. But still pin_user_pages(FOLL_LONGTERM) is definitely the API
> you should be using for telling the kernel you are going to DMA into these
> pages and want to hold onto them for a long time.
>

Changing the application workflow and then doing ioctl() with the address
is what I ideally want. As a workaround, the options are either
find_get_page() alone or vm_mmap() with pin_user_pages(), and the latter is
preferred.

Thanks,
Ivan
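
For reference, the userspace side of the "mmap the file and pass the address" workflow discussed above could look roughly like the sketch below. It is only an illustration under assumptions: struct dma_buf_req, hp_round_up() and hp_buffer_prepare() are hypothetical names that do not come from this thread, and the driver ioctl that would consume the request is not shown because its interface is not specified here.

```c
/* Sketch only: prepare a 2MiB-hugepage-backed buffer in userspace and
 * describe it by address and length instead of by file descriptor.
 * struct dma_buf_req and both helpers are hypothetical. */
#define _GNU_SOURCE
#include <fcntl.h>      /* fallocate() */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>   /* memfd_create(), mmap() */
#include <unistd.h>     /* close() */

/* Fallbacks for older libc headers; values match the kernel UAPI. */
#ifndef MFD_HUGETLB
#define MFD_HUGETLB 0x0004U
#endif
#ifndef MFD_HUGE_2MB
#define MFD_HUGE_2MB (21U << 26)   /* log2(2MiB) << 26 */
#endif

#define HP_SIZE ((size_t)2 << 20)  /* 2MiB */

/* Hypothetical ioctl argument: a user address, not a file descriptor. */
struct dma_buf_req {
	uint64_t addr;
	uint64_t len;
};

/* Round len up to a whole number of 2MiB hugepages. */
size_t hp_round_up(size_t len)
{
	return (len + HP_SIZE - 1) & ~(HP_SIZE - 1);
}

/* Create, allocate and map the buffer, then fill in req.
 * Returns the memfd on success, or -1 on failure (for example when no
 * hugepages are reserved on the system). */
int hp_buffer_prepare(size_t len, struct dma_buf_req *req)
{
	size_t size = hp_round_up(len);
	void *addr;
	int fd;

	if (size == 0)
		return -1;

	fd = memfd_create("dma-buf", MFD_CLOEXEC | MFD_HUGETLB | MFD_HUGE_2MB);
	if (fd < 0)
		return -1;

	/* Allocate the hugepages up front. */
	if (fallocate(fd, 0, 0, (off_t)size) < 0) {
		close(fd);
		return -1;
	}

	addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (addr == MAP_FAILED) {
		close(fd);
		return -1;
	}

	req->addr = (uint64_t)(uintptr_t)addr;
	req->len = size;
	return fd;
}
```

The application would then pass the filled-in request to the driver's ioctl, and the driver could pin the range with pin_user_pages(FOLL_LONGTERM) before setting up DMA, as suggested in the quoted text.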