On 05/04/2024 11:20, Kent Overstreet wrote:
>>> The thing is that there's no requirement for an interface as complex as
>>> the one you're proposing here. I've talked to a few database people
>>> and all they want is to increase the untorn write boundary from "one
>>> disc block" to one database block, typically 8kB or 16kB.
>>> So they would be quite happy with a much simpler interface where they
>>> set the inode block size at inode creation time, and then all writes to
>>> that inode were guaranteed to be untorn. This would also be simpler to
>>> implement for buffered writes.
>>
>> You're conflating filesystem functionality that applications will use
>> with hardware and block-layer enablement that filesystems and
>> filesystem utilities need to configure the filesystem in ways that
>> allow users to make use of atomic write capability of the hardware.
>> The block layer functionality needs to export everything that the
>> hardware can do and filesystems will make use of. The actual
>> application usage and setup of atomic writes at the filesystem/page
>> cache layer is a separate problem. i.e. The block layer interfaces
>> need only support direct IO and expose limits for issuing atomic
>> direct IO, and nothing more. All the more complex stuff to make it
>> "easy to use" is filesystem level functionality and completely
>> outside the scope of this patchset....
>
> A CoW filesystem can implement atomic writes without any block device
> support. It seems to me that might have been the easier place to start -
> start by getting the APIs right, then do all the plumbing for efficient
> untorn writes on non CoW filesystems...
Patches 03/10 and 04/10 in this series define the user API, i.e. the
RWF_ATOMIC flag and the statx updates.
Any filesystem-specific changes - like in
https://lore.kernel.org/linux-xfs/20240304130428.13026-1-john.g.garry@xxxxxxxxxx/
- are just for enabling this API for that filesystem.
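
To illustrate how an application might consume that API, here is a rough sketch (not taken from the series) of the check it could run before issuing a write with RWF_ATOMIC. It assumes the commonly described constraints: the length is a power of two, it lies within the minimum and maximum atomic write unit reported through statx, and the file offset is naturally aligned to the length. The helper name untorn_write_ok() and its unit_min/unit_max parameters are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if x is a nonzero power of two. */
static bool is_pow2(uint64_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Hypothetical helper, not part of the patchset: unit_min and unit_max
 * stand in for the atomic write limits an application would read back
 * via the statx updates.  Returns true if a write of len bytes at
 * offset meets the assumed constraints for an untorn write.
 */
static bool untorn_write_ok(uint64_t offset, uint64_t len,
			    uint64_t unit_min, uint64_t unit_max)
{
	if (!is_pow2(len))
		return false;		/* length must be a power of two */
	if (len < unit_min || len > unit_max)
		return false;		/* outside the advertised limits */
	return (offset & (len - 1)) == 0; /* naturally aligned to len */
}
```

With assumed limits of 4kB and 64kB, a 16kB database block written at offset 16kB passes the check, while the same block at offset 8kB does not, so the application would have to fall back to its own torn-write protection for it.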