From: Darrick J. Wong <djwong@xxxxxxxxxx>

Currently, we don't let an internal log consume every last block in an
AG.  According to the comment, we're doing this to avoid tripping AGF
verifiers if freeblks==0, but on a modern filesystem this isn't
sufficient to avoid problems because we need to have enough space in the
AG to allocate an aligned root inode chunk, if it should be the case
that the log also ends up in AG 0:

$ truncate -s 6366g /tmp/a ; mkfs.xfs -f /tmp/a -d agcount=3200 -l agnum=0
meta-data=/tmp/a                 isize=512    agcount=3200, agsize=521503 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=0 inobtcount=0
data     =                       bsize=4096   blocks=1668808704, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=521492, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
mkfs.xfs: root inode created in AG 1, not AG 0

Therefore, modify the maximum internal log size calculation to constrain
the maximum internal log size so that the aligned inode chunk allocation
will always succeed.

Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
Reviewed-by: Christoph Hellwig <hch@xxxxxx>
---
 mkfs/xfs_mkfs.c |   47 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 47 insertions(+)

diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
index eb4d7fa9..0b1fb746 100644
--- a/mkfs/xfs_mkfs.c
+++ b/mkfs/xfs_mkfs.c
@@ -3270,6 +3270,49 @@ validate_log_size(uint64_t logblocks, int blocklog, int min_logblocks)
 	}
 }
 
+static void
+adjust_ag0_internal_logblocks(
+	struct mkfs_params	*cfg,
+	struct xfs_mount	*mp,
+	int			min_logblocks,
+	int			*max_logblocks)
+{
+	int			backoff = 0;
+	int			ichunk_blocks;
+
+	/*
+	 * mkfs will trip over the write verifiers if the log is allocated in
+	 * AG 0 and consumes enough space that we cannot allocate a non-sparse
+	 * inode chunk for the root directory.  The inode allocator requires
+	 * that the AG have enough free space for the chunk itself plus enough
+	 * to fix up the freelist with aligned blocks if we need to fill the
+	 * allocation from the AGFL.
+	 */
+	ichunk_blocks = XFS_INODES_PER_CHUNK * cfg->inodesize >> cfg->blocklog;
+	backoff = ichunk_blocks * 4;
+
+	/*
+	 * We try to align inode allocations to the data device stripe unit,
+	 * so ensure there's enough space to perform an aligned allocation.
+	 * The inode geometry structure isn't set up yet, so compute this by
+	 * hand.
+	 */
+	backoff = max(backoff, cfg->dsunit * 2);
+
+	*max_logblocks -= backoff;
+
+	/* If the specified log size is too big, complain. */
+	if (cli_opt_set(&lopts, L_SIZE) && cfg->logblocks > *max_logblocks) {
+		fprintf(stderr,
+_("internal log size %lld too large, must be less than %d\n"),
+			(long long)cfg->logblocks,
+			*max_logblocks);
+		usage();
+	}
+
+	cfg->logblocks = min(cfg->logblocks, *max_logblocks);
+}
+
 static void
 calculate_log_size(
 	struct mkfs_params	*cfg,
@@ -3382,6 +3425,10 @@ _("log ag number %lld too large, must be less than %lld\n"),
 	} else
 		cfg->logagno = (xfs_agnumber_t)(sbp->sb_agcount / 2);
 
+	if (cfg->logagno == 0)
+		adjust_ag0_internal_logblocks(cfg, mp, min_logblocks,
+				&max_logblocks);
+
 	cfg->logstart = XFS_AGB_TO_FSB(mp, cfg->logagno,
 				       libxfs_prealloc_blocks(mp));
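
For reviewers who want to check the arithmetic, here is a rough standalone
sketch of the backoff computation using the geometry from the reproducer
above (isize=512, bsize=4096, sunit=0). It is not part of the patch. The
helper names ag0_log_backoff() and clamp_ag0_logblocks() are hypothetical,
INODES_PER_CHUNK is hardcoded to 64 on the assumption that it matches
XFS_INODES_PER_CHUNK, and the starting cap of 521492 blocks is simply the
log size mkfs printed, taken here as the old maximum.

```c
#include <assert.h>

/* Assumed to match XFS_INODES_PER_CHUNK (64 inodes per chunk). */
#define INODES_PER_CHUNK	64

/*
 * Hypothetical helper mirroring the patch's backoff calculation:
 * four inode chunks' worth of blocks, or twice the data stripe unit,
 * whichever is larger.  All quantities are in filesystem blocks.
 */
static int
ag0_log_backoff(int inodesize, int blocklog, int dsunit)
{
	int	ichunk_blocks;
	int	backoff;
	int	aligned;

	assert(inodesize > 0 && blocklog >= 0 && dsunit >= 0);

	/* Blocks needed for one full (non-sparse) inode chunk. */
	ichunk_blocks = (INODES_PER_CHUNK * inodesize) >> blocklog;
	backoff = ichunk_blocks * 4;

	/* Leave room for a stripe-aligned allocation as well. */
	aligned = dsunit * 2;
	return backoff > aligned ? backoff : aligned;
}

/*
 * Hypothetical helper: shrink the cap by the backoff and clamp the
 * requested log size to the new cap.
 */
static int
clamp_ag0_logblocks(int logblocks, int max_logblocks, int backoff)
{
	int	new_max = max_logblocks - backoff;

	return logblocks < new_max ? logblocks : new_max;
}
```

With the reproducer's geometry, one inode chunk is (64 * 512) >> 12 = 8
blocks, so the backoff is 32 blocks and a log that previously filled
521492 blocks would be clamped to 521460. With a nonzero stripe unit,
say dsunit=128 blocks, the alignment term dominates and the backoff
becomes 256 blocks.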