Re: [PATCH v4 00/14] forcealign for xfs

Dave Chinner <david@xxxxxxxxxxxxx> · Fri, 6 Sep 2024 07:47:22 +1000

On Thu, Sep 05, 2024 at 11:15:41AM +0100, John Garry wrote:
> > >     - Does the unmapping of extents also only happen in extsize
> > >     chunks (with forcealign)?
> > 
> > Yes, via use of xfs_inode_alloc_unitsize() in the high level code
> > aligning the fsbno ranges to be unmapped.
> > 
> > Remember, force align requires both logical file offset and
> > physical block number to be correctly aligned, so unmap alignment
> > has to be set up correctly at file offset level before we even know
> > what extents underlie the file range we need to unmap....
> > 
> > >       If the start or end of the extent which needs unmapping is
> > >       unaligned then we convert that extent to unwritten and skip,
> > >       is it? (__xfs_bunmapi())
> > 
> > The high level code should be aligning the start and end of the
> > file range to be removed via xfs_inode_alloc_unitsize().
> 
> Is that the case for something like truncate? There we just say what is the
> end block which we want to truncate to in
> xfs_itruncate_extents_flags(new_size)  ->
> xfs_bunmapi_range(XFS_B_TO_FSB(new_size)), and that may not be alloc unit
> aligned.

Ah, I thought we had that alignment in xfs_itruncate_extents_flags()
already, but if we don't then that's a bug that needs to be fixed.

We change the space reservation in xfs_setattr_size() for this case
(patch 9) but then don't do any alignment there - it relies on
xfs_itruncate_extents_flags() to do the right thing for extent
removal alignment at the new EOF.

i.e. The xfs_setattr_size() code takes care of EOF block zeroing and
page cache removal so the user doesn't see old data beyond EOF,
whilst xfs_itruncate_extents_flags() is supposed to take care of the
extent removal and the details of that operation (e.g. alignment).

Patch 10 also modifies xfs_can_free_eofblocks() to take alignment
into account for the post-eof block removal, but doesn't change
xfs_free_eofblocks() at all. i.e. it also relies on
xfs_itruncate_extents_flags() to do the right thing for force
aligned inodes.
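
To make the alignment-aware post-eof check concrete, here is a rough
userspace model of the idea, not the kernel code. The helper names and
the "mapped_end_fsb" parameter (one past the last mapped block) are
made up for illustration; only the round-up behaviour of
XFS_B_TO_FSB() is taken from the real macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bytes to filesystem blocks, rounding up, like XFS_B_TO_FSB(). */
static uint64_t bytes_to_fsb(uint64_t bytes, uint32_t blocksize)
{
	return (bytes + blocksize - 1) / blocksize;
}

/* Round a block number up to the next allocation unit boundary. */
static uint64_t round_up_to_unit(uint64_t fsb, uint32_t unit_blocks)
{
	assert(unit_blocks > 0);
	return ((fsb + unit_blocks - 1) / unit_blocks) * unit_blocks;
}

/*
 * Hypothetical model of the check: blocks are only "post-eof" and
 * freeable if they lie beyond EOF rounded up to the allocation unit.
 * mapped_end_fsb is exclusive (one past the last mapped block).
 */
static bool has_freeable_eofblocks(uint64_t isize, uint64_t mapped_end_fsb,
				   uint32_t blocksize, uint32_t unit_blocks)
{
	uint64_t end_fsb = round_up_to_unit(bytes_to_fsb(isize, blocksize),
					    unit_blocks);

	return mapped_end_fsb > end_fsb;
}
```

With a 4-block allocation unit and a 5000 byte file on 4096 byte
blocks, blocks 2 and 3 sit inside the EOF allocation unit and so are
not candidates for removal; with a 1-block unit they would be.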

In this case, we are removing post-eof speculative preallocation
that has been allocated by delalloc conversion during
writeback.  These post-eof extents will already be unwritten extents
because delalloc conversion uses unwritten extents to avoid
stale data exposure if we crash between allocation and the data
being written to the extents. Hence there should be no extents to
convert to unwritten in the majority of cases here.

The only case where we might get written extents beyond EOF is if
the file has been truncated down, but in that case we don't really
care because truncate should have already taken care of post-eof
extent alignment for us. xfs_can_free_eofblocks() will see this
extent alignment and so we'll skip xfs_free_eofblocks() in this case
altogether....

Hence xfs_free_eofblocks() should never need to convert a partial
unaligned extent range to unwritten when force-align is enabled
because the post-eof extents should already be unwritten. We also
want to leave the inode in the most optimal state for future
extension, which means we want the post-eof extent to be correctly
aligned.

Hence there are multiple reasons that xfs_itruncate_extents_flags()
should be aligning the post-EOF block it is starting the unmapping
at for force aligned allocation contexts. And in doing so, we remove
the weird corner case where we can have an unaligned extent state
boundary at EOF for atomic writes....
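
As a sketch of what aligning the unmap start could look like, the
following userspace model rounds the first block to unmap up to the
allocation unit. This is an assumption about the shape of the fix, not
the actual patch: the struct, the function name and the
"mapped_end_fsb" parameter are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Half-open block range [start_fsb, end_fsb) to be unmapped. */
struct unmap_range {
	uint64_t start_fsb;
	uint64_t end_fsb;
};

/*
 * Hypothetical model: compute the range a truncate to new_size would
 * unmap when extents must be removed in whole allocation units.
 * mapped_end_fsb is one past the last mapped block of the file.
 */
static struct unmap_range truncate_unmap_range(uint64_t new_size,
					       uint64_t mapped_end_fsb,
					       uint32_t blocksize,
					       uint32_t unit_blocks)
{
	struct unmap_range r;
	uint64_t first;

	assert(blocksize > 0 && unit_blocks > 0);

	/* First block beyond the new EOF, rounding up as XFS_B_TO_FSB() does. */
	first = (new_size + blocksize - 1) / blocksize;

	/* Push the start out to the next allocation unit boundary. */
	first = ((first + unit_blocks - 1) / unit_blocks) * unit_blocks;

	r.start_fsb = first;
	/* Nothing to unmap if the aligned start is at or past the mapped end. */
	r.end_fsb = mapped_end_fsb > first ? mapped_end_fsb : first;
	return r;
}
```

Because the start is always a multiple of the allocation unit, the
unmap never begins part way through a unit, so there is no partial
range left over that would need converting to unwritten.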

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx