On Tue, Dec 10, 2024 at 12:54:39PM +0100, cem@xxxxxxxxxx wrote:
> From: Carlos Maiolino <cmaiolino@xxxxxxxxxx>
>
> I tripped over an integer overflow when using a big journal size.
>
> Essentially I can reliably reproduce it using:
>
> mkfs.xfs -f -lsize=393216b -f -b size=4096 -m crc=1,reflink=1,rmapbt=1, \
> -i sparse=1 /dev/vdb2 > /dev/null
> mount -o usrquota,grpquota,prjquota /dev/vdb2 /mnt
> xfs_io -x -c 'shutdown -f' /mnt
> umount /mnt
> mount -o usrquota,grpquota,prjquota /dev/vdb2 /mnt
>

My apologies, I just realized I posted the wrong reproducer here. The
correct one is:

mkfs.xfs -f -lsize=393216b -f -b size=4096 -m crc=1,reflink=1,rmapbt=1, -i sparse=1 /dev/vdb2 > /dev/null
mount -o usrquota,grpquota,prjquota /dev/vdb2 /mnt
xfs_io -x -c 'shutdown -f' /mnt
umount /mnt
mount -o ro,norecovery,usrquota,grpquota,prjquota /dev/vdb2 /mnt

The lockup I mentioned happens on the norecovery mount, not on the regular
mount as I first stated in the patch description.

Sorry for the confusion.

> The last mount command gets stuck on the following path:
>
> [<0>] xlog_grant_head_wait+0x5d/0x2a0 [xfs]
> [<0>] xlog_grant_head_check+0x112/0x180 [xfs]
> [<0>] xfs_log_reserve+0xe3/0x260 [xfs]
> [<0>] xfs_trans_reserve+0x179/0x250 [xfs]
> [<0>] xfs_trans_alloc+0x101/0x260 [xfs]
> [<0>] xfs_sync_sb+0x3f/0x80 [xfs]
> [<0>] xfs_qm_mount_quotas+0xe3/0x2f0 [xfs]
> [<0>] xfs_mountfs+0x7ad/0xc20 [xfs]
> [<0>] xfs_fs_fill_super+0x762/0xa50 [xfs]
> [<0>] get_tree_bdev_flags+0x131/0x1d0
> [<0>] vfs_get_tree+0x26/0xd0
> [<0>] vfs_cmd_create+0x59/0xe0
> [<0>] __do_sys_fsconfig+0x4e3/0x6b0
> [<0>] do_syscall_64+0x82/0x160
> [<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> By investigating it a bit, I noticed that xlog_grant_head_check (called
> from xfs_log_reserve), defines free_bytes as an integer, which in turn
> is used to store the value from xlog_grant_space_left().
> xlog_grant_space_left() however, does return a uint64_t, and, given a
> big enough journal size, it can overflow the free_bytes in
> xlog_grant_head_check(), causing the conditional:
>
> else if (free_bytes < *need_bytes) {
>
> in xlog_grant_head_check() to evaluate to true and cause xfsaild to try
> to flush the log indefinitely, which seems to be causing xfs to get
> stuck in xlog_grant_head_wait() indefinitely.
>
> I'm adding a fixes tag as a suggestion from hch, given that after the
> aforementioned patch, all xlog_grant_space_left() callers should store
> the return value in a 64-bit type.
>
> Fixes: c1220522ef40 ("xfs: grant heads track byte counts, not LSNs")
> Signed-off-by: Carlos Maiolino <cmaiolino@xxxxxxxxxx>
> ---
>
> I'd like to add a caveat here, because I don't properly understand the
> journal code/mechanism yet. It does seem to me that it is feasible to
> have the reserve grant head go to a big number and indeed cause the
> overflow, but I'm not completely sure that what I'm fixing is a real bug
> or just the symptom of something else (or maybe a bug that triggered
> another overflow bug :)
>
>
>  fs/xfs/xfs_log.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index 05daad8a8d34..a799821393b5 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -222,7 +222,7 @@ STATIC bool
>  xlog_grant_head_wake(
>  	struct xlog		*log,
>  	struct xlog_grant_head	*head,
> -	int			*free_bytes)
> +	uint64_t		*free_bytes)
>  {
>  	struct xlog_ticket	*tic;
>  	int			need_bytes;
> @@ -302,7 +302,7 @@ xlog_grant_head_check(
>  	struct xlog_ticket	*tic,
>  	int			*need_bytes)
>  {
> -	int			free_bytes;
> +	uint64_t		free_bytes;
>  	int			error = 0;
>
>  	ASSERT(!xlog_in_recovery(log));
> @@ -1088,7 +1088,7 @@ xfs_log_space_wake(
>  	struct xfs_mount	*mp)
>  {
>  	struct xlog		*log = mp->m_log;
> -	int			free_bytes;
> +	uint64_t		free_bytes;
>
>  	if (xlog_is_shutdown(log))
>  		return;
> --
> 2.47.1