On Wed, Jun 28, 2023 at 07:27:26PM +0100, David Howells wrote:
> Matt Whitlock <kernel@xxxxxxxxxxxxxxxxx> wrote:
>
> > In other words, the currently implemented behavior is appropriate for
> > SPLICE_F_MOVE, but it is not appropriate for ~SPLICE_F_MOVE.
>
> The problems with SPLICE_F_MOVE is that it's only applicable to splicing *out*
> of a pipe. By the time you get that far the pages can already be corrupted by
> a shared-writable mmap or write().

That's not documented in the man page.

Indeed, I think Matt's point - and mine, too, for that matter - is
that the splice(2) man page documents *none* of this
"copy-by-reference" behaviour or its side effects.

All the man page documents is that the data is *copied in
kernel-space* rather than needing kernel->user->kernel data movement
to copy it from one fd to the other. The man page *heavily implies*
that splice is a "fast immediate data copy". It most definitely does
not describe any "zero-copy with whacky post-completion data stream
corrupting side effects" mechanisms.

There's not even an entry in the "notes" or "bugs" section to warn
users that they cannot trust the contents of the source or
destination pipe to be what they think they might be, as the "data
copy" implied by the pipe buffer might not occur until some arbitrary
time in the future.

Hence, according to the man page, what it is doing right now is
definitely contrary to the behaviour implied by the documentation...

i.e. if the data that is "copied" to the destination pipe is not
resolved until some future action by some 3rd party process is
performed, then the man page must tell users they cannot use this for
any sort of data stream where they require the data being transferred
to remain stable as of the time of the splice operation.

-Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx