On Wed, Mar 19, 2025 at 10:57:29PM -0700, Christoph Hellwig wrote:
> On Wed, Mar 19, 2025 at 10:45:22AM -0700, Joe Damato wrote:
> > I don't disagree; I just don't know if app developers:
> >   a.) know that this is possible to do, and
> >   b.) know how to do it
>
> So if you don't know that why do you even do the work?

I am doing the work because I use splice and sendfile and it seems
relatively straightforward to make them safer using an existing
mechanism, at least for network sockets.

After dropping the sendfile2 patches completely, it looks like in my
new set all of the code is within CONFIG_NET defines in fs/splice.c.

> > In general: it does seem a bit odd to me that there isn't a safe
> > sendfile syscall in Linux that uses existing completion notification
> > mechanisms.
>
> Agreed.  Where the existing notification mechanism is called io_uring.

Sure. As I mentioned to Jens: I agree that any new system call
should be built differently.

But does that mean we should leave splice and sendfile as-is when
there is a way to potentially make them safer?

In my other message to Jens I proposed:
  - SPLICE_F_ZC for splice to generate zc completion notifications
    to the error queue
  - Modifying sendfile so that if SO_ZEROCOPY (which already exists)
    is set on a network socket, zc completion notifications are
    generated.

In both cases no new system call is needed and both splice and
sendfile become safer to use.

At some point in the future, a mechanism built on top of io_uring
and introduced as new system calls (sendmsg2, sendfile2, splice2,
etc) can be built.

> > Of course, I certainly agree that the error queue is a workaround.
> > But it works, apps use it, and it's fairly well known. I don't see any
> > reason, other than historical context, why sendmsg can use this
> > mechanism, splice can, but sendfile shouldn't?
>
> Because sendmsg should never have done that it certainly should not
> spread beyond purely socket specific syscalls.

I don't know the entire historical context, but I presume sendmsg
did that because there was no other mechanism at the time.

I will explain it more clearly in the next cover letter, but the way
I see the situation is:
  - There are existing system calls which operate on network sockets
    (splice and sendfile) that avoid copies
  - There is a mechanism already in the kernel in the networking
    stack for generating completion notifications
  - Both splice and sendfile could be extended to support this for
    network sockets so they can be used more safely, without
    introducing a new system call

> > If you feel very strongly that this cannot be merged without
> > dropping sendfile2 and only plumbing this through for splice, then
> > I'll drop the sendfile2 syscall when I submit officially (probably
> > next week?).
>
> Splice should also not do "error queue notifications".  Nothing
> new and certainly nothing outside of net/ should.

It seems like Jens suggested that plumbing this through for splice
was a possibility, but sounds like you disagree.

Not really sure how to proceed here? If code I am modifying is
within CONFIG_NET defines, but lives in fs/splice.c ... is that
within the realm of net or fs?

I am asking because I genuinely don't know.

As mentioned above and in other messages, it seems like it is
possible to improve the networking parts of splice (and therefore
sendfile) to make them safer to use without introducing a new system
call.

Are you saying that you are against doing that, even if the code is
network specific (but lives in fs/)?

> > I do feel pretty strongly that it's more likely apps would use
> > sendfile2 and we'd have safer apps out in the wild. But, I could be
> > wrong.
>
> A purely synchronous sendfile that is safe is a good thing.  Spreading
> non-standard out of band notifications is not.  How to build that
> safe sendmsg is a good question, and a sendmsg2 might be a sane
> option for that.
> The important thing is that the underlying code
> should use iocbs and ki_complete to notify I/O completion so that
> all the existing infrastructure like io_uring and in-kernel callers
> can reuse this.

I'm not currently planning to build sendmsg2 (and I've already
mentioned to Jens and above I will drop sendfile2), but if I have
the time it sounds like an interesting project.