On Tue, Mar 12, 2019 at 03:39:33AM -0700, Ira Weiny wrote:
> IMHO I don't think that the copy_file_range() is going to carry us
> through the next wave of user performance requirements. RDMA, while
> the first, is not the only technology which is looking to have
> direct access to files. XDP is another.[1]

Sure, all I was doing here was demonstrating that people have been
trying to get local direct access to file mappings to DMA directly
into them for a long time. Direct IO games like these are now largely
unnecessary because we now have much better APIs to do zero-copy data
transfer between files (which can do hardware offload if it is
available!). There's a minimal sketch of that path at the end of this
mail.

It's the long-term pins that RDMA does that are the problem here.

I'm assuming that for XDP, you're talking about userspace zero copy
from files to the network hardware and vice versa? Transmit is simple
(a read-only mapping), but receive probably requires BPF programs to
ensure that the data (minus headers) in the incoming packet stream is
correctly placed into the UMEM region?

XDP receive seems pretty much like the same problem as RDMA writes
into the file. i.e. the incoming write DMAs are going to have to
trigger page faults if the UMEM is a long-term pin so that the
filesystem behaves correctly with this remote data placement.

I'd suggest that RDMA, XDP and any other hardware that is going to
pin file-backed mappings for the long term need to use the same
"inform the fs of a write operation into its mapping" mechanisms...

And if we start talking about wanting to do peer-to-peer DMA from a
network/GPU device to a storage device without going through a
file-backed CPU mapping, we still need to have the filesystem
involved to translate file offsets to the storage locations the
filesystem has allocated for the data, and to lock them down for as
long as the peer-to-peer DMA offload is in place.

In effect, this is the same problem as RDMA+FS-DAX - the filesystem
owns the file offset to storage location mapping and manages storage
access arbitration, not the mm/vma mapping presented to userspace....

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
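
For anyone following along, this is roughly what the "much better
APIs" path looks like from userspace: a file-to-file copy driven
entirely by copy_file_range(2), where the kernel (and, on filesystems
with reflink or server-side copy offload, the hardware/server) moves
the data without it ever touching a user buffer. The paths and error
handling are illustrative only; it needs a 4.5+ kernel and a libc
that exposes the syscall:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
	/* placeholder paths */
	int fd_in = open("/path/to/src", O_RDONLY);
	int fd_out = open("/path/to/dst", O_WRONLY | O_CREAT | O_TRUNC, 0644);
	struct stat st;

	if (fd_in < 0 || fd_out < 0 || fstat(fd_in, &st) < 0) {
		perror("setup");
		return 1;
	}

	for (off_t left = st.st_size; left > 0; ) {
		/* NULL offsets: use and advance both file positions */
		ssize_t n = copy_file_range(fd_in, NULL, fd_out, NULL,
					    left, 0);
		if (n <= 0) {
			perror("copy_file_range");
			return 1;
		}
		left -= n;
	}
	return 0;
}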
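
And this is the shape of the long-term pin problem: a libibverbs
memory registration over a file-backed mapping. ibv_reg_mr() pins the
pages for the lifetime of the MR, and nothing along the way tells the
filesystem that a device may now be writing to its blocks. Device
selection, the file path and the region size below are placeholders,
not anything from a real setup:

#include <infiniband/verbs.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	size_t len = 1 << 20;				/* placeholder size */
	int fd = open("/path/to/file", O_RDWR);		/* placeholder path */
	void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_SHARED, fd, 0);

	struct ibv_device **devs = ibv_get_device_list(NULL);
	if (buf == MAP_FAILED || !devs || !devs[0]) {
		fprintf(stderr, "no mapping or no RDMA device\n");
		return 1;
	}
	struct ibv_context *ctx = ibv_open_device(devs[0]);
	struct ibv_pd *pd = ctx ? ibv_alloc_pd(ctx) : NULL;
	if (!pd) {
		fprintf(stderr, "device setup failed\n");
		return 1;
	}

	/*
	 * This is the long-term pin: the HCA can DMA into the file's
	 * pages until ibv_dereg_mr() is called, and the filesystem is
	 * never informed of any of those writes.
	 */
	struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
			IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE);
	if (!mr)
		perror("ibv_reg_mr");
	else
		ibv_dereg_mr(mr);

	ibv_dealloc_pd(pd);
	ibv_close_device(ctx);
	ibv_free_device_list(devs);
	return 0;
}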
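
The XDP receive case would look much the same: mmap() the file and
register the mapping as an AF_XDP UMEM, at which point the kernel
pins it for the life of the socket. Whether the kernel should accept
a file-backed mapping here at all is exactly what's under discussion;
the file path, chunk size and UMEM size below are made-up values, and
it assumes 4.18+ headers for the AF_XDP bits:

#include <linux/if_xdp.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef AF_XDP			/* values from the Linux UAPI */
#define AF_XDP 44
#endif
#ifndef SOL_XDP
#define SOL_XDP 283
#endif

int main(void)
{
	size_t len = 16 * 4096;				/* placeholder UMEM size */
	int fd = open("/path/to/file", O_RDWR);		/* placeholder path */

	/* the file-backed mapping the NIC would DMA received frames into */
	void *umem = mmap(NULL, len, PROT_READ | PROT_WRITE,
			  MAP_SHARED, fd, 0);
	int xsk = socket(AF_XDP, SOCK_RAW, 0);
	if (umem == MAP_FAILED || xsk < 0) {
		perror("setup");
		return 1;
	}

	struct xdp_umem_reg reg = {
		.addr = (uintptr_t)umem,
		.len = len,
		.chunk_size = 4096,
		.headroom = 0,
	};
	/* registration is where the kernel takes the long-term pin */
	if (setsockopt(xsk, SOL_XDP, XDP_UMEM_REG, &reg, sizeof(reg)) < 0)
		perror("XDP_UMEM_REG");

	close(xsk);
	munmap(umem, len);
	close(fd);
	return 0;
}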