From: Dave Chinner <dchinner@xxxxxxxxxx> Currently when we allocate at EOF, we set the initial target to the location of the inode. Then we call xfs_bmap_adjacent(), which sees that we are doing an EOF extension, so it moves the target to the last block of the previous extent. This may be in a different AG to the inode. When we then go to select the AG with the best free length, the AG at the target block might not have sufficient free space for the full allocation, so we may select a different AG. We then do an exact BNO allocation with the original target (EOF block), which reverts back to attempting an allocation in an AG that doesn't have sufficient contiguous free space available. This generally leads to allocation failure, and then we fall back to scanning the AGs for one that the allocation will succeed in. This scan also results in contended AGS being skipped, so we have no idea what AG we are going to end up allocating in. For sequential writes, this results in random extents being located in random places in non-target AGs. We want to guarantee that we can allocate in the AG that we have selected as having the "best contiguous free space" efficiently, so if we select a different AG, we should move the allocation target and skip the exact EOF allocation as we know it will not succeed. i.e. we should start with aligned allocation immediately, knowing it will likely succeed. Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx> --- fs/xfs/libxfs/xfs_bmap.c | 32 ++++++++++++++++++++++++++------ 1 file changed, 26 insertions(+), 6 deletions(-) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index e14671414afb..e64ba7e2d13d 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -3252,8 +3252,18 @@ xfs_bmap_btalloc_select_lengths( if (error && error != -EAGAIN) break; error = 0; - if (*blen >= args->maxlen) + if (*blen >= args->maxlen) { + /* + * We are going to target a different AG than the + * incoming target, so we need to reset the target and + * skip exact EOF allocation attempts. + */ + if (agno != startag) { + ap->blkno = XFS_AGB_TO_FSB(mp, agno, 0); + ap->aeof = false; + } break; + } } if (pag) xfs_perag_rele(pag); @@ -3514,10 +3524,10 @@ xfs_bmap_btalloc_aligned( int error; /* - * If we failed an exact EOF allocation already, stripe - * alignment will have already been taken into account in - * args->minlen. Hence we only adjust minlen to try to preserve - * alignment if no slop has been reserved for alignment + * If we failed an exact EOF allocation already, stripe alignment will + * have already been taken into account in args->minlen. Hence we only + * adjust minlen to try to preserve alignment if no slop has been + * reserved for alignment */ if (args->minalignslop == 0) { if (blen > stripe_align && @@ -3653,6 +3663,16 @@ xfs_bmap_btalloc_best_length( ap->blkno = XFS_INO_TO_FSB(args->mp, ap->ip->i_ino); xfs_bmap_adjacent(ap); + /* + * We only use stripe alignment for EOF allocations. Hence if it isn't + * an EOF allocation, clear the stripe alignment. This allows us to + * skip exact block EOF allocation yet still do stripe aligned + * allocation if we select a different AG to the + * exact target block due to a lack of contiguous free space. + */ + if (!ap->aeof) + stripe_align = 0; + /* * Search for an allocation group with a single extent large enough for * the request. If one isn't found, then adjust the minimum allocation @@ -3675,7 +3695,7 @@ xfs_bmap_btalloc_best_length( } /* attempt aligned allocation for new EOF extents */ - if (ap->aeof) + if (stripe_align) error = xfs_bmap_btalloc_aligned(ap, args, blen, stripe_align, false); if (error || args->fsbno != NULLFSBLOCK) -- 2.40.1