On 17/09/2024 23:27, Dave Chinner wrote:
# xfs_bmap -vvp mnt/file
mnt/file:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL FLAGS
0: [0..15]: 384..399 0 (384..399) 16 010000
1: [16..31]: 400..415 0 (400..415) 16 000000
2: [32..127]: 416..511 0 (416..511) 96 010000
3: [128..255]: 256..383 0 (256..383) 128 000000
FLAG Values:
0010000 Unwritten preallocated extent
Here we have unaligned extents wrt extsize.
The sub-alloc unit zeroing would solve that - is that what you would still
advocate (to solve that issue)?
Yes, I thought that was already implemented for force-align with the
DIO code via the extsize zero-around changes in the iomap code. Why
isn't that zero-around code ensuring the correct extent layout here?
I just have not included the extsize zero-around changes here. They were
just grouped with the atomic writes support, as they were added
specifically for the atomic writes support. Indeed - to me at least - it
is strange that the DIO code changes are required for XFS forcealign
implementation. And, even if we use extsize zero-around changes for DIO
path, what about buffered IO?
BTW, I still have concern with this extsize zero-around change which I
was making:
xfs_iomap_write_unwritten()
{
unsigned int rounding;
/* when converting anything unwritten, we must be spanning an alloc
unit, so round up/down */
if (rounding > 1) {
offset_fsb = rounddown(rounding);
count_fsb = roundup(rounding);
}
...
do {
xfs_bmapi_write();
...
xfs_trans_commit();
} while ();
}
As mentioned elsewhere, it's a bit of a bodge (to do this rounding).
FWIW, I also understand things are different if we are doing 128kB
atomic writes on 16kB force aligned files. However, in this
situation we are treating the 128kB atomic IO as eight individual
16kB atomic IOs that are physically contiguous.
Yes, if 16kB force aligned, userspace can only issue 16KB atomic writes.
Right, but the eventual goal (given the statx parameters) is to be
able to do 8x16kB sequential atomic writes as a single 128kB IO, yes?
No, if atomic write unit max is 16KB, then userspace can only issue a
single 16KB atomic write.
However, some things to consider:
a. the block layer may merge those 16KB atomic writes
b. userspace may also merge 16KB atomic writes and issue a larger atomic
write (if atomic write unit max is > 16KB)
I had been wondering if there is any value in a lib for helping with b.
Again, this is different to the traditional RT file behaviour - it
can use unwritten extents for sub-alloc-unit alignment unmaps
because the RT device can align file offset to any physical offset,
and issue unaligned sector sized IO without any restrictions. Forced
alignment does not have this freedom, and when we extend forced
alignment to RT files, it will not have the freedom to use
unwritten extents for sub-alloc-unit unmapping, either.
So how do you think that we should actually implement
xfs_itruncate_extents_flags() properly for forcealign? Would it simply be
like:
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1050,7 +1050,7 @@ xfs_itruncate_extents_flags(
WARN_ON_ONCE(first_unmap_block > XFS_MAX_FILEOFF);
return 0;
}
+ if (xfs_inode_has_forcealign(ip))
+ first_unmap_block = xfs_inode_roundup_alloc_unit(ip,
first_unmap_block);
error = xfs_bunmapi_range(&tp, ip, flags, first_unmap_block,
Yes, it would be something like that, except it would have to be
done before first_unmap_block is verified.
ok, and are you still of the opinion that this does not apply to rtvol?
The rtvol is*not* force-aligned. It -may- have some aligned
allocation requirements that are similar (i.e. sb_rextsize > 1 fsb)
but it does*not* force-align extents, written or unwritten.
The moment we add force-align support to RT files (as is the plan),
then the force-aligned inodes on the rtvol will need to behave as
force aligned inodes, not "rtvol" inodes.
ok, fine
Thanks,
John