On 05/09/2024 22:47, Dave Chinner wrote:
If the start or end of the extent which needs unmapping is
unaligned then we convert that extent to unwritten and skip,
is it? (__xfs_bunmapi())
The high level code should be aligning the start and end of the
file range to be removed via xfs_inode_alloc_unitsize().
Is that the case for something like truncate? There we just say what is the
end block which we want to truncate to in
xfs_itruncate_extents_flags(new_size) ->
xfs_bunmapi_range(XFS_B_TO_FSB(new_size)), and that may not be alloc unit
aligned.
Ah, I thought we had that alignment in xfs_itruncate_extents_flags()
already, but if we don't then that's a bug that needs to be fixed.
AFAICS, forcealign behaviour is same as RT, so then this would be a
mainline bug, right?
> > We change the space reservation in xfs-setattr_size() for this case
(patch 9) but then don't do any alignment there - it relies on
xfs_itruncate_extents_flags() to do the right thing w.r.t. extent
removal alignment w.r.t. the new EOF.
i.e. The xfs_setattr_size() code takes care of EOF block zeroing and
page cache removal so the user doesn't see old data beyond EOF,
whilst xfs_itruncate_extents_flags() is supposed to take care of the
extent removal and the details of that operation (e.g. alignment).
So we should roundup the unmap block to the alloc unit, correct? I have
my doubts about that, and thought that xfs_bunmapi_range() takes care of
any alignment handling.
Patch 10 also modifies xfs_can_free_eofblocks() to take alignment
into account for the post-eof block removal, but doesn't change
xfs_free_eofblocks() at all. i.e it also relies on
xfs_itruncate_extents_flags() to do the right thing for force
aligned inodes.
What state should the blocks post-EOF blocks be? A simple example of
partially truncating an alloc unit is:
$xfs_io -c "extsize" mnt/file
[16384] mnt/file
$xfs_bmap -vvp mnt/file
mnt/file:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL FLAGS
0: [0..20479]: 192..20671 0 (192..20671) 20480 000000
$truncate -s 10461184 mnt/file # 10M - 6FSB
$xfs_bmap -vvp mnt/file
mnt/file:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL FLAGS
0: [0..20431]: 192..20623 0 (192..20623) 20432 000000
1: [20432..20447]: 20624..20639 0 (20624..20639) 16 010000
FLAG Values:
0010000 Unwritten preallocated extent
Is that incorrect state?
In this case, we are removing post-eof speculative preallocation
that that has been allocated by delalloc conversion during
writeback. These post-eof extents will already be unwritten extents
because delalloc conversion uses unwritten extents to avoid
stale data exposure if we crash between allocation and the data
being written to the extents. Hence there should be no extents to
convert to unwritten in the majority of cases here.
The only case where we might get written extents beyond EOF is if
the file has been truncated down, but in that case we don't really
care because truncate should have already taken care of post-eof
extent alignment for us. xfs_can_free_eofblocks() will see this
extent alignment and so we'll skip xfs_free_eofblocks() in this case
altogether....
Hence xfs_free_eofblocks() should never need to convert a partial
unaligned extent range to unwritten when force-align is enabled
because the post-eof extents should already be unwritten. We also
want to leave the inode in the most optimal state for future
extension, which means we want the post-eof extent to be correctly
aligned.
Hence there are multiple reasons that xfs_itruncate_extents_flags()
should be aligning the post-EOF block it is starting the unmapping
at for force aligned allocation contexts. And in doing so, we remove
the weird corner case where we can have an unaligned extent state
boundary at EOF for atomic writes....
Yeah, I don't think that sub-alloc unit extent zeroing would help us
there, as we not be dealing with a new extent (for zeroing to occur).
Thanks,
John