On Wed, Jan 08, 2025 at 11:39:35AM +0000, John Garry wrote:
> On 08/01/2025 01:26, Darrick J. Wong wrote:
> > > > I (vaguely) agree with that.
> > > >
> > > > > And only if the file mapping is in the correct state, and the
> > > > > program is willing to *maintain* them in the correct state to get the
> > > > > better performance.
> > > > I kinda agree with that, but the maintain is a bit hard as general
> > > > rule of thumb as file mappings can change behind the application's
> > > > back. So building interfaces around the concept that there are
> > > > entirely stable mappings seems like a bad idea.
> > > I tend to agree.
> > As long as it's a general rule that file mappings can change even after
> > whatever prep work an application tries to do, we're never going to have
> > an easy time enabling any of these fancy direct-to-storage tricks like
> > cpu loads and stores to pmem, or this block-untorn writes stuff.
> >
> > > > > I don't want xfs to grow code to write zeroes to
> > > > > mapped blocks just so it can then write-untorn to the same blocks.
> > > > Agreed.
>
> Any other ideas on how to achieve this then?
>
> There was the proposal to create a single bio covering mixed mappings, but
> then we had the issue that all the mappings cannot be atomically converted.
> I am not sure if this is really such an issue. I know that RWF_ATOMIC means
> all or nothing, but partially converted extents (from an atomic write) are a
> sort of grey area, as the original unmapped extents had nothing in the first
> place.

The long way -- introducing a file remap log intent item to guarantee
that the ioend processing completes no matter how mixed the mapping
might be.

> > > >
> > > So if we want to allow large writes over mixed extents, how to handle?
> > >
> > > Note that some time ago we also discussed that we don't want to have a
> > > single bio covering mixed extents as we cannot atomically convert all
> > > unwritten extents to mapped.
> > From https://lore.kernel.org/linux-xfs/Z3wbqlfoZjisbe1x@xxxxxxxxxxxxx/ :
> >
> > "I think we should wire it up as a new FALLOC_FL_WRITE_ZEROES mode,
> > document very vigorously that it exists to facilitate pure overwrites
> > (specifically that it returns EOPNOTSUPP for always-cow files), and not
> > add more ioctls."
> >
> > If we added this new fallocate mode to set up written mappings, would it
> > be enough to write in the programming manuals that applications should
> > use it to prepare a file for block-untorn writes?
>
> Sure, that API extension could be useful in the case that we conclude that
> we don't permit atomic writes over mixed mappings.
>
> > Perhaps we should
> > change the errno code to EMEDIUMTYPE for the mixed mappings case.
> >
> > Alternately, maybe we /should/ let programs open a lease-fd on a file
> > range, do their untorn writes through the lease fd, and if another
> > thread does something to break the lease, then the lease fd returns EIO
> > until you close it.
>
> So does this mean applications own specific ranges in files for exclusive
> atomic writes? Wouldn't that break what we already support today?

The application would own a lease on a specific range, but it could pass
that fd around. Also you wouldn't need a lease for a single-fsblock
untorn write.

--D

> Cheers,
> John