On 05/12/2024 06:30, Darrick J. Wong wrote:
> On Thu, Dec 05, 2024 at 07:35:45AM +1100, Dave Chinner wrote:
>> On Wed, Dec 04, 2024 at 03:43:41PM +0000, John Garry wrote:
>>> From: "Ritesh Harjani (IBM)" <ritesh.list@xxxxxxxxx>
>>>
>>> Filesystems like ext4 can submit writes in multiples of blocksizes.
>>> But we still can't allow the writes to be split into multiple BIOs. Hence
>>> let's check if the iomap_length() is the same as iter->len or not.
>>>
>>> It is the responsibility of userspace to ensure that a write does not span
>>> mixed unwritten and mapped extents (which would lead to multiple BIOs).
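
For readers following along, the check described above can be modelled in
userspace roughly as below. This is only a sketch, not the patch itself:
struct mapping_model, mapped_length() and check_atomic_write() are
hypothetical names, and returning -EINVAL is an assumption about how the
rejection is reported.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the single mapping returned for this iteration. */
struct mapping_model {
	uint64_t offset;	/* file offset where the mapping starts */
	uint64_t length;	/* number of bytes the mapping covers */
};

/*
 * Bytes of the request [pos, pos + len) that this one mapping can serve.
 * Loosely mirrors the idea behind iomap_length(): the request length,
 * clamped to the end of the current mapping.
 */
static uint64_t mapped_length(const struct mapping_model *map,
			      uint64_t pos, uint64_t len)
{
	uint64_t map_end = map->offset + map->length;
	uint64_t avail;

	if (pos < map->offset || pos >= map_end)
		return 0;
	avail = map_end - pos;
	return avail < len ? avail : len;
}

/*
 * An atomic write must be covered by one mapping, otherwise it would be
 * split into multiple BIOs. Non-atomic writes are allowed to be split.
 */
static int check_atomic_write(const struct mapping_model *map,
			      uint64_t pos, uint64_t len, bool atomic)
{
	if (atomic && mapped_length(map, pos, len) != len)
		return -EINVAL;	/* assumed error code */
	return 0;
}
```

With an 8 KiB mapping starting at offset 0, an 8 KiB atomic write at offset
0 passes, while the same write at offset 4096 is rejected because only 4 KiB
of it falls inside the mapping.
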
>> How is "userspace" supposed to do this?
>>
>> No existing utility in userspace is aware of atomic write limits or
>> rtextsize configs, so how does "userspace" ensure everything is
>> laid out in a manner compatible with atomic writes?
>>
>> e.g. restoring a backup (or other disaster recovery procedures) is
>> going to have to lay the files out correctly for atomic writes.
>> Backup tools often sparsify the data set and so what gets restored
>> will not have the same layout as the original data set...
>>
>> Where's the documentation that outlines all the restrictions on
>> userspace behaviour to prevent this sort of problem being triggered?
>> Common operations such as truncate, hole punch, buffered writes,
>> reflinks, etc will trip over this, so application developers, users
>> and admins really need to know what they should be doing to avoid
>> stepping on this landmine...
>
> I'm kinda assuming that this requires forcealign to get the extent
> alignments correct, and writing zeroes non-atomically if the extent
> state gets mixed up before retrying the untorn write. John?
Sure, the code to do the automatic pre-zeroing and retry the atomic
write is not super complicated.
It's just a matter of whether we add it or not.
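
As a rough illustration of the pre-zero-and-retry idea, here is a small
userspace model. It is a sketch under stated assumptions, not kernel code:
the per-block state array, try_atomic_write(), zero_unwritten() and
atomic_write_with_prezero() are all hypothetical, and the model assumes
extent alignment is already taken care of (e.g. by forcealign).

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-block extent state in the model. */
enum blk_state { BLK_UNWRITTEN, BLK_MAPPED };

/*
 * Model of the untorn write: it only succeeds if every block in
 * [first, first + count) has the same state, i.e. one mapping, one BIO.
 * On failure the block states are left untouched.
 */
static int try_atomic_write(enum blk_state *blocks, size_t first, size_t count)
{
	size_t i;

	for (i = 1; i < count; i++)
		if (blocks[first + i] != blocks[first])
			return -EINVAL;	/* mixed extent state */

	/* The write went through, so the whole range is now written. */
	for (i = 0; i < count; i++)
		blocks[first + i] = BLK_MAPPED;
	return 0;
}

/* Non-atomic zeroing: convert unwritten blocks in the range to mapped. */
static void zero_unwritten(enum blk_state *blocks, size_t first, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++)
		if (blocks[first + i] == BLK_UNWRITTEN)
			blocks[first + i] = BLK_MAPPED;
}

/*
 * Try the atomic write; if the extent state is mixed, pre-zero the
 * unwritten part and retry once. *retried is set to 1 if the fallback ran.
 */
static int atomic_write_with_prezero(enum blk_state *blocks, size_t first,
				     size_t count, int *retried)
{
	int ret = try_atomic_write(blocks, first, count);

	if (ret != -EINVAL)
		return ret;

	zero_unwritten(blocks, first, count);
	if (retried)
		*retried = 1;
	return try_atomic_write(blocks, first, count);
}
```

The zeroing step does not need to be atomic because unwritten extents
already read back as zeroes, so converting them changes the extent state
but not the data a reader would see.
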
Thanks,
John