Re: Transaction log reservation overrun when fallocating realtime file

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Nov 26, 2019 at 04:34:26PM -0800, Darrick J. Wong wrote:
> On Tue, Nov 26, 2019 at 12:27:14PM -0800, Omar Sandoval wrote:
> > Hello,
> > 
> > The following reproducer results in a transaction log overrun warning
> > for me:
> > 
> >   mkfs.xfs -f -r rtdev=/dev/vdc -d rtinherit=1 -m reflink=0 /dev/vdb
> >   mount -o rtdev=/dev/vdc /dev/vdb /mnt
> >   fallocate -l 4G /mnt/foo
> > 
> > I've attached the full dmesg output. My guess at the problem is that the
> > tr_write reservation used by xfs_alloc_file_space is not taking the realtime
> > bitmap and realtime summary inodes into account (inode numbers 129 and 130 on
> > this filesystem, which I do see in some of the log items). However, I'm not
> > familiar enough with the XFS transaction guts to confidently fix this. Can
> > someone please help me out?
> 
> Hmm...
> 
> /*
>  * In a write transaction we can allocate a maximum of 2
>  * extents.  This gives:
>  *    the inode getting the new extents: inode size
>  *    the inode's bmap btree: max depth * block size
>  *    the agfs of the ags from which the extents are allocated: 2 * sector
>  *    the superblock free block counter: sector size
>  *    the allocation btrees: 2 exts * 2 trees * (2 * max depth - 1) * block size
>  * And the bmap_finish transaction can free bmap blocks in a join:
>  *    the agfs of the ags containing the blocks: 2 * sector size
>  *    the agfls of the ags containing the blocks: 2 * sector size
>  *    the super block free block counter: sector size
>  *    the allocation btrees: 2 exts * 2 trees * (2 * max depth - 1) * block size
>  */
> STATIC uint
> xfs_calc_write_reservation(...);
> 
> So this means that the rt allocator can burn through at most ...
> 1 ext * 2 trees * (2 * maxdepth - 1) * blocksize
> ... worth of log reservation as part of setting bits in the rtbitmap and
> fiddling with the rtsummary information.
> 
> Instead, 4GB of 4k rt extents == 1 million rtexts to mark in use, which
> is 131072 bytes of rtbitmap to log, and *kaboom* there goes the 109K log
> reservation.

Ok, if that's the case, we still need to be able to allocate MAXEXTLEN in
a single transaction. That's 2^21 filesystem blocks, which at most
is 2^21 rtexts.

Hence I think we probably should have a separate rt-write
reservation that handles this case, and we use that for allocation
on rt devices rather than the bt-based allocation reservation.


> From: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> 
> xfs: cap realtime allocation length to something we can log
> 
> Omar Sandoval reported that a 4G fallocate on the realtime device causes
> filesystem shutdowns due to a log reservation overflow that happens when
> we log the rtbitmap updates.
> 
> The tr_write transaction reserves enough log reservation to handle a
> full splits of both free space btrees, so cap the rt allocation at that
> number of bits.
> 
> "The following reproducer results in a transaction log overrun warning
> for me:
> 
>     mkfs.xfs -f -r rtdev=/dev/vdc -d rtinherit=1 -m reflink=0 /dev/vdb
>     mount -o rtdev=/dev/vdc /dev/vdb /mnt
>     fallocate -l 4G /mnt/foo
> 
> Reported-by: Omar Sandoval <osandov@xxxxxxxxxxx>
> Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> ---
>  fs/xfs/xfs_bmap_util.c |   23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
> 
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index 49d7b530c8f7..15c4e2790de3 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
> @@ -69,6 +69,26 @@ xfs_zero_extent(
>  }
>  
>  #ifdef CONFIG_XFS_RT
> +/*
> + * tr_write allows for one full split in the bnobt and cntbt to record the
> + * allocation, and that's how many bits of rtbitmap we can log to the
> + * transaction.  We leave one full block's worth of log space to handle the
> + * rtsummary update, though that's probably overkill.
> + */
> +static inline uint64_t
> +xfs_bmap_rtalloc_max(
> +	struct xfs_mount	*mp)
> +{
> +	uint64_t		max_rtbitmap;
> +
> +	max_rtbitmap = xfs_allocfree_log_count(mp, 1) - 1;
> +	max_rtbitmap *= XFS_FSB_TO_B(mp, 1);
> +	max_rtbitmap *= NBBY;
> +	max_rtbitmap *= mp->m_sb.sb_rextsize;

I can see how this works, but it strikes me as a bit of a hack. We
calculate the worst case reservations up front to avoid having to
play games like this in the code. Hence I think the correct thing to
do is fix the reservation to ensure we can do MAXEXTLEN allocations
without overruns...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux