[PATCH 2/9] xfs: contiguous EOF allocation across AGs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Dave Chinner <dchinner@xxxxxxxxxx>

Currently when we allocate at EOF, we set the initial target to the
location of the inode. Then we call xfs_bmap_adjacent(), which sees
that we are doing an EOF extension, so it moves the target to the
last block of the previous extent. This may be in a different AG to
the inode.

When we then go to select the AG with the best free length, the AG
at the target block might not have sufficient free space for the
full allocation, so we may select a different AG. We then do an
exact BNO allocation with the original target (EOF block), which
reverts back to attempting an allocation in an AG that doesn't have
sufficient contiguous free space available.

This generally leads to allocation failure, and then we fall back to
scanning the AGs for one that the allocation will succeed in. This
scan also results in contended AGS being skipped, so we have no idea
what AG we are going to end up allocating in. For sequential writes,
this results in random extents being located in random places in
non-target AGs.

We want to guarantee that we can allocate in the AG that we have
selected as having the "best contiguous free space" efficiently,
so if we select a different AG, we should move the allocation target
and skip the exact EOF allocation as we know it will not succeed.
i.e. we should start with aligned allocation immediately, knowing it
will likely succeed.

Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
---
 fs/xfs/libxfs/xfs_bmap.c | 32 ++++++++++++++++++++++++++------
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index e14671414afb..e64ba7e2d13d 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -3252,8 +3252,18 @@ xfs_bmap_btalloc_select_lengths(
 		if (error && error != -EAGAIN)
 			break;
 		error = 0;
-		if (*blen >= args->maxlen)
+		if (*blen >= args->maxlen) {
+			/*
+			 * We are going to target a different AG than the
+			 * incoming target, so we need to reset the target and
+			 * skip exact EOF allocation attempts.
+			 */
+			if (agno != startag) {
+				ap->blkno = XFS_AGB_TO_FSB(mp, agno, 0);
+				ap->aeof = false;
+			}
 			break;
+		}
 	}
 	if (pag)
 		xfs_perag_rele(pag);
@@ -3514,10 +3524,10 @@ xfs_bmap_btalloc_aligned(
 	int			error;
 
 	/*
-	 * If we failed an exact EOF allocation already, stripe
-	 * alignment will have already been taken into account in
-	 * args->minlen. Hence we only adjust minlen to try to preserve
-	 * alignment if no slop has been reserved for alignment
+	 * If we failed an exact EOF allocation already, stripe alignment will
+	 * have already been taken into account in args->minlen. Hence we only
+	 * adjust minlen to try to preserve alignment if no slop has been
+	 * reserved for alignment
 	 */
 	if (args->minalignslop == 0) {
 		if (blen > stripe_align &&
@@ -3653,6 +3663,16 @@ xfs_bmap_btalloc_best_length(
 	ap->blkno = XFS_INO_TO_FSB(args->mp, ap->ip->i_ino);
 	xfs_bmap_adjacent(ap);
 
+	/*
+	 * We only use stripe alignment for EOF allocations. Hence if it isn't
+	 * an EOF allocation, clear the stripe alignment. This allows us to
+	 * skip exact block EOF allocation yet still do stripe aligned
+	 * allocation if we select a different AG to the
+	 * exact target block due to a lack of contiguous free space.
+	 */
+	if (!ap->aeof)
+		stripe_align = 0;
+
 	/*
 	 * Search for an allocation group with a single extent large enough for
 	 * the request.  If one isn't found, then adjust the minimum allocation
@@ -3675,7 +3695,7 @@ xfs_bmap_btalloc_best_length(
 	}
 
 	/* attempt aligned allocation for new EOF extents */
-	if (ap->aeof)
+	if (stripe_align)
 		error = xfs_bmap_btalloc_aligned(ap, args, blen, stripe_align,
 				false);
 	if (error || args->fsbno != NULLFSBLOCK)
-- 
2.40.1




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux