On Tue, Jul 16, 2024 at 10:48:40AM +0800, Huan Yang wrote:
> I just researched udmabuf. Please correct me if I'm wrong.
>
> On 2024/7/15 20:32, Christian König wrote:
> > On 15.07.24 at 11:11, Daniel Vetter wrote:
> > > On Thu, Jul 11, 2024 at 11:00:02AM +0200, Christian König wrote:
> > > > On 11.07.24 at 09:42, Huan Yang wrote:
> > > > > Some users may need to load a file into a dma-buf. The current
> > > > > way is:
> > > > > 1. allocate a dma-buf, get the dma-buf fd
> > > > > 2. mmap the dma-buf fd into vaddr
> > > > > 3. read(file_fd, vaddr, fsz)
> > > > > This is too heavy if fsz reaches the GB range.
> > > > You need to describe a bit more why that is too heavy. I can only
> > > > assume you need to save memory bandwidth and avoid the extra copy
> > > > with the CPU.
> > > >
> > > > > This patch implements a feature called
> > > > > DMA_HEAP_IOCTL_ALLOC_READ_FILE. The user needs to offer the
> > > > > file_fd they want loaded into the dma-buf; in return, the
> > > > > returned dma-buf fd is promised to contain the file content.
> > > > Interesting idea, that has at least more potential than trying to
> > > > enable direct I/O on mmap()ed DMA-bufs.
> > > >
> > > > The approach with the new IOCTL might not work because it is a
> > > > very specialized use case.
> > > >
> > > > But IIRC there was a copy_file_range callback in the
> > > > file_operations structure you could use for that. I'm just not
> > > > sure when and how that's used with the copy_file_range() system
> > > > call.
> > > I'm not sure any of those help, because internally they're all
> > > still based on struct page (or maybe in the future on folios). And
> > > that's the thing dma-buf can't give you, at least without peeking
> > > behind the curtain.
> > >
> > > I think an entirely different option would be malloc+udmabuf. That
> > > essentially handles the impedance mismatch between direct I/O and
> > > dma-buf on the dma-buf side. The downside is that it'll make the
> > > permanently pinned memory accounting and tracking issues even more
> > > apparent, but I guess eventually we do need to sort that one out.
> >
> > Oh, very good idea!
> > Just one minor correction: it's not malloc+udmabuf, but rather
> > create_memfd()+udmabuf.

Hm right, it's create_memfd() + mmap(memfd) + udmabuf.
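I.e. roughly the following, as an untested sketch (error handling
trimmed; it assumes the file size is already page aligned and O_DIRECT
block aligned, and load_file_into_udmabuf is just an illustrative name):

#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <linux/udmabuf.h>

int load_file_into_udmabuf(const char *path)
{
	int file_fd = open(path, O_RDONLY | O_DIRECT);
	struct stat st;
	fstat(file_fd, &st);
	size_t size = st.st_size;	/* assumed suitably aligned */

	/* udmabuf wants a memfd sealed against shrinking */
	int memfd = memfd_create("file-staging", MFD_ALLOW_SEALING);
	ftruncate(memfd, size);
	fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK);

	/*
	 * Direct I/O straight into the memfd pages. This must complete
	 * before the udmabuf is created, because udmabuf pins the pages.
	 */
	void *vaddr = mmap(NULL, size, PROT_READ | PROT_WRITE,
			   MAP_SHARED, memfd, 0);
	read(file_fd, vaddr, size);
	munmap(vaddr, size);
	close(file_fd);

	/* Wrap the memfd range into a dma-buf */
	struct udmabuf_create create = {
		.memfd	= memfd,
		.flags	= UDMABUF_FLAGS_CLOEXEC,
		.offset	= 0,
		.size	= size,
	};
	int dev_fd = open("/dev/udmabuf", O_RDWR);
	int dmabuf_fd = ioctl(dev_fd, UDMABUF_CREATE, &create);

	close(dev_fd);
	close(memfd);
	return dmabuf_fd;
}

The resulting fd can then be handed to the consuming driver like any
other dma-buf fd.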
> > And you need to complete your direct I/O before creating the udmabuf
> > since that reference will prevent direct I/O from working.
>
> udmabuf will pin all pages, so once the fd is returned, direct I/O
> can't be triggered (same as dma-buf). So the read must complete before
> the pages are pinned.

Why does pinning prevent direct I/O? I haven't tested, but I'd expect
the rdma folks would be really annoyed if that's the case ...

> But the current way is to use `memfd_pin_folios` to speed up
> allocation and pinning, so that may need adapting.
>
> I currently doubt that the udmabuf solution is suitable for our
> gigabyte-level read operations.
>
> 1. The current mmap operation uses faulting, so frequent page faults
> will be triggered during reads, resulting in a lot of context
> switching overhead.
>
> 2. The current udmabuf size limit is 64MB. Even if that can be
> changed, maybe it's not good to use at large sizes?

Yeah that's just a figleaf so we don't have to bother about the
accounting issue.

> 3. The migration and adaptation of the driver is also a challenge, and
> currently, we are unable to control it.

Why does a udmabuf fd not work instead of any other dma-buf fd? That
shouldn't matter for the consuming driver ...

> Perhaps implementing `copy_file_range` would be more suitable for us.

See my other mail, fundamentally these all rely on struct page being
present, and dma-buf doesn't give you that. Which means you need to go
below the dma-buf abstraction. And udmabuf is pretty much the thing for
that, because it wraps normal struct page memory into a dma-buf.

And copy_file_range on the underlying memfd might already work, I
haven't checked though.
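Something like the below, as a rough and untested sketch (fill_memfd is
just an illustrative name; note that cross-filesystem copy_file_range
only works on 5.19+ kernels, and the fallback path is still a CPU copy,
so whether it buys you anything depends on the filesystems involved):

#define _GNU_SOURCE
#include <unistd.h>

/*
 * Fill a memfd from file_fd with copy_file_range() instead of
 * mmap()+read(), before the memfd gets wrapped into a udmabuf.
 */
ssize_t fill_memfd(int file_fd, int memfd, size_t size)
{
	off64_t in_off = 0, out_off = 0;
	size_t left = size;

	while (left > 0) {
		ssize_t n = copy_file_range(file_fd, &in_off,
					    memfd, &out_off, left, 0);
		if (n <= 0)
			return n; /* 0 on early EOF, -1 on error */
		left -= n;
	}
	return size;
}

Cheers, Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch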