On Wed, Sep 04, 2024 at 11:44:29PM +0530, Ritesh Harjani wrote: > John Garry <john.g.garry@xxxxxxxxxx> writes: > > > This series is being spun off the block atomic writes for xfs > > series at [0]. > > > > That series got too big. > > > > The actual forcealign patches are roughly the same in this > > series. > > > > Why forcealign? In some scenarios to may be required to > > guarantee extent alignment and granularity. > > > > For example, for atomic writes, the maximum atomic write unit > > size would be limited at the extent alignment and granularity, > > guaranteeing that an atomic write would not span data present in > > multiple extents. > > > > forcealign may be useful as a performance tuning optimization in > > other scenarios. > > > > I decided not to support forcealign for RT devices here. > > Initially I thought that it would be quite simple of implement. > > However, I discovered through much testing and subsequent debug > > that this was not true, so I decided to defer support to > > later. > > > > Early development xfsprogs support is at: > > https://github.com/johnpgarry/xfsprogs-dev/commits/atomic-writes/ > > > > Hi John, > > Thanks for your continued work on atomic write. I went over the > XFS patch series and this is my understanding + some queries. > Could you please help with these. Hi Ritesh - to make it easier for everyone to read and reply to you emails, can you please word wrap the text at 72 columns? > 1. As I understand XFS untorn atomic write support is built on top > of FORCEALIGN feature (which this series is adding) which in turn > uses extsize hint feature underneath. Yes. > Now extsize hint mainly controls the alignment of both > "physical start" & "logical start" offset and extent length, > correct? Yes. > This is done using args->alignment for start aand > args->prod/mode variables for extent length. Correct? Yes. > - If say we are not able to allocate an aligned physical start? > Then since extsize is just a hint we go ahead with whatever > best available extent is right? No. The definition of "forced alignment" is that we guarantee aligned allocation to the extent size hint. i.e the extent size hint is not a hint anymore - it defines the alignment we are guaranteeing allocation will achieve. hence if we can't align the extent to the alignment provided, we fail the alignment. > - also extsize looks to be only providing allocation side of hints. (not de-allocation). Correct? No. See the use of xfs_inode_alloc_unitsize() in all the places where we free space. Forced alignment extends this function to return the extent size, not the block size. > 2. If say there is an append write i.e. the allocation is needed > to be done at EOF. Then we try for an exact bno (from eof block) > and aligned extent length, right? Yes. This works because the previous extent is exactly aligned, hence a contiguous allocation will continue to be correctly aligned due to the forced alignment constraints. > i.e. xfs_bmap_btalloc_filestreams() -> > xfs_bmap_btalloc_at_eof(ap, args); If it is not available then > we try for nearby bno xfs_alloc_vextent_near_bno(args, target) > and similar... yes, that's just the normal aligned allocation fallback path when exact allocation fails. > 3. It is the FORCEALIGN feature which _mandates_ both allocation > (by using extsize hint) and de-allocation to happen _only_ in > extsize chunks. > > i.e. forcealign mandates - > - the logical and physical start offset should be aligned as > per args->alignment > - extent length be aligned as per args->prod/mod. > If above two cannot be satisfied then return -ENOSPC. Yes. > > - Does the unmapping of extents also only happens in extsize > chunks (with forcealign)? Yes, via use of xfs_inode_alloc_unitsize() in the high level code aligning the fsbno ranges to be unmapped. Remember, force align requires both logical file offset and physical block number to be correctly aligned, so unmap alignment has to be set up correctly at file offset level before we even know what extents underly the file range we need to unmap.... > If the start or end of the extent which needs unmapping is > unaligned then we convert that extent to unwritten and skip, > is it? (__xfs_bunmapi()) The high level code should be aligning the start and end of the file range to be removed via xfs_inode_alloc_unitsize(). Hence the low level __xfs_bunmapi() code shouldn't ever be encountering unaligned unmaps on force-aligned inodes. -Dave. -- Dave Chinner david@xxxxxxxxxxxxx