On Thu, Feb 09, 2023 at 08:47:07PM -0800, Linus Torvalds wrote:
> On Thu, Feb 9, 2023 at 8:06 PM Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> >
> > So while I was pondering the complexity of this and watching a great
> > big shiny rocket create lots of heat, light and noise, it occurred
> > to me that we already have a mechanism for preventing page cache
> > data from being changed while the folios are under IO:
> > SB_I_STABLE_WRITES and folio_wait_stable().
>
> No, Dave. Not at all.
>
> Stop and think.

I have.

> splice() is not some "while under IO" thing. It's *UNBOUNDED*.

Splice has two sides - a source where we splice to the transport
pipe, then a destination where we splice pages from the transport
pipe. For better or worse, time in the transport pipe is unbounded,
but that does not mean the source or destination have unbounded
processing times.

However, transport times being unbounded is largely irrelevant, and
misses the fact that the application does not require pages in
transit to be stable.

The application we are talking about here is file -> pipe -> network
stack for zero copy sending of static file data and the problem is
that the file pages are not stable whilst they are under IO in the
network stack.

IOWs, the application does not care if the data changes whilst it is
in transport attached to the pipe - it only cares that the contents
are stable once they have been delivered and are now wholly owned by
the network stack IO path so that the OTW encodings (checksum,
encryption, whatever) done within the network IO path don't get
compromised.

i.e. the file pages only need to be stable whilst the network stack
IO path checksums and DMAs the data to the network hardware. That's
exactly the same IO context in which the block device stack requires
the page contents to be stable - across parity/checksum calculations
and the subsequent DMA transfers to the storage hardware.

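To make "stable only whilst under IO" concrete, here is a minimal
userspace sketch in C. It is a toy model under stated assumptions, not
kernel code: struct toy_page and every toy_* function are hypothetical
names invented for illustration, and a writer that would sleep in the
kernel until IO completes is modelled here as a TOY_WRITE_MUST_WAIT
return value.

```c
/* Toy userspace model of "stable whilst under IO". NOT kernel code;
 * every name below is hypothetical and exists only for illustration. */
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 16

struct toy_page {
	unsigned char data[TOY_PAGE_SIZE];
	int under_io;	/* > 0 while a destination is doing IO on the page */
};

enum toy_write_result { TOY_WRITE_DONE, TOY_WRITE_MUST_WAIT };

/* A writer may modify the page unless a destination has it under IO. */
static enum toy_write_result toy_page_write(struct toy_page *p,
					    const void *src, size_t len)
{
	assert(len <= TOY_PAGE_SIZE);
	if (p->under_io > 0)
		return TOY_WRITE_MUST_WAIT;	/* models the writer waiting */
	memcpy(p->data, src, len);
	return TOY_WRITE_DONE;
}

/* Stand-in for an on-the-wire checksum: a plain byte sum. */
static uint32_t toy_checksum(const struct toy_page *p)
{
	uint32_t sum = 0;
	for (size_t i = 0; i < TOY_PAGE_SIZE; i++)
		sum += p->data[i];
	return sum;
}

static void toy_io_begin(struct toy_page *p) { p->under_io++; }
static void toy_io_end(struct toy_page *p)   { p->under_io--; }

/* Returns 0 if every step behaves as described, non-zero otherwise. */
int toy_demo(void)
{
	struct toy_page page = {0};

	toy_page_write(&page, "AAAA", 4);	/* initial file contents */

	/* In transit (attached to the pipe): changes are still allowed. */
	if (toy_page_write(&page, "BBBB", 4) != TOY_WRITE_DONE)
		return 1;

	/* Destination owns the page now: checksum, then "DMA". */
	toy_io_begin(&page);
	uint32_t csum = toy_checksum(&page);
	if (toy_page_write(&page, "CCCC", 4) != TOY_WRITE_MUST_WAIT)
		return 2;
	if (toy_checksum(&page) != csum)	/* data sent matches checksum */
		return 3;
	toy_io_end(&page);

	/* IO complete: the deferred write can now go ahead. */
	if (toy_page_write(&page, "CCCC", 4) != TOY_WRITE_DONE)
		return 4;
	return memcmp(page.data, "CCCC", 4) != 0 ? 5 : 0;
}
```

The point of the sketch is that the write is only held off between
toy_io_begin() and toy_io_end(); the time the page spends in transit
before that places no restriction on the writer at all.
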
I'm suggesting that the page should only need to be held stable
whilst it is under IO, whether that IO is in the network stack via
skbs or in the block device stack via bios. Both network and block IO
are bounded by fixed time limits, both IO paths typically only need
pages held stable for a few milliseconds at a time, and both have
worst case IO times in error situations that are typically bounded at
a few minutes.

IOWs, splice is a complete misdirection here - it doesn't need to
know a thing about stable data requirements at all. It's the
destination processing that requires stable data, not the transport
mechanism.

Hence if we have a generic mechanism that the network stack can use
to detect a file backed page and mark it needing to be stable whilst
the network stack is doing IO on it, everything on the filesystem
side should just work like it does for pages under IO in the block
device stack...

Indeed, I suspect that a filesystem -> pipe -> filesystem zero copy
path via splice probably also needs stable source pages for some
filesystems, in which case we need exactly the same mechanism as we
need for stable pages in the network stack zero copy splice
destination path....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx