Re: [PATCH] xfs: fix livelock in delayed allocation at ENOSPC

On Fri, Apr 21, 2023 at 08:53:00PM -0700, Darrick J. Wong wrote:
> On Sat, Apr 22, 2023 at 08:24:40AM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > 
> > On a filesystem with a non-zero stripe unit and a large sequential
> > write, delayed allocation will set a minimum allocation length of
> > the stripe unit. If allocation fails because there are no extents
> > long enough for an aligned minlen allocation, it is supposed to
> > fall back to unaligned allocation which allows single block extents
> > to be allocated.
> > 
> > When the allocator code was rewritten in the 6.3 cycle, this
> > fallback was broken - the old code used args->fsbno as both the
> > allocation target and the allocation result, the new code passes the
> > target as a separate parameter. The conversion didn't handle the
> > aligned->unaligned fallback path correctly - it reset args->fsbno to
> > the target fsbno on failure which broke allocation failure detection
> > in the high level code and so it never fell back to unaligned
> > allocations.
> > 
> > This resulted in a loop in writeback trying to allocate an aligned
> > block, getting a false positive success, and then trying to insert
> > the result in the BMBT. This did nothing because the extent was
> > already in the BMBT (merge results in an unchanged extent) and so it
> > returned the prior extent to the conversion code as the current iomap.
> > 
> > Because the iomap returned didn't cover the offset we tried to map,
> > xfs_convert_blocks() then retries the allocation, which fails in the
> > same way and now we have a livelock.
> > 
> > Reported-by: Brian Foster <bfoster@xxxxxxxxxx>
> > Fixes: 85843327094f ("xfs: factor xfs_bmap_btalloc()")
> > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> 
> Will give this one a spin through the test system over the weekend.
> 
> In the meantime, can one of you come up with a reproducer?  From the
> description, it doesn't sound like that should be too hard -- mount with
> no stripe unit set, fragment the free space, mount with a stripe unit
> set, then run the fs out of space?

No need.

# ./run_check --run-opts "-s xfs_align -g enospc"
Running: MOUNT_OPTIONS= ./check -R xunit -b -s xfs_align -g enospc
SECTION       -- xfs_align
FSTYP         -- xfs (debug)
PLATFORM      -- Linux/x86_64 test2 6.3.0-rc6-dgc+ #1779 SMP PREEMPT_DYNAMIC Fri Apr 14 11:24:18 AEST 2023
MKFS_OPTIONS  -- -f -m rmapbt=1 -dsu=128k,sw=2 /dev/vdb
MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/vdb /mnt/scratch

generic/015 1s ... 

Hangs immediately on the first ENOSPC test.

IOWs, up to this point, no ENOSPC testing had been done on stripe
aligned filesystems. A hole in the (ever expanding) test matrix we
need to run...

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
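
The failure mode described in the quoted commit message can be sketched
as a toy user-space model in C. This is not the kernel code: every
toy_* name, the TOY_NULLFSBLOCK sentinel, the fixed target block number
and the retry cap are assumptions made up for illustration. The model
only shows why resetting the result field to the target on failure hides
the failure from the caller, so the unaligned fallback never runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "no block allocated" sentinel for this toy model. */
#define TOY_NULLFSBLOCK UINT64_MAX

/* Toy allocation arguments; loosely based on the description above. */
struct toy_alloc_args {
    uint64_t fsbno;   /* result: allocated block, or TOY_NULLFSBLOCK */
    uint64_t minlen;  /* minimum acceptable extent length in blocks */
};

/*
 * Aligned allocation attempt. It can only succeed when the longest free
 * extent is at least minlen blocks. With buggy == true, a failed attempt
 * resets the result to the target, which the caller cannot tell apart
 * from a successful allocation.
 */
static void toy_alloc_aligned(struct toy_alloc_args *args, uint64_t target,
                              uint64_t longest_free, bool buggy)
{
    if (longest_free >= args->minlen) {
        args->fsbno = target;
        return;
    }
    args->fsbno = buggy ? target : TOY_NULLFSBLOCK;
}

/*
 * Toy writeback loop. Returns the number of aligned attempts made before
 * progress, or 0 if max_retries attempts passed without any progress
 * (standing in for the livelock; the real loop has no such cap).
 */
static int toy_writeback(uint64_t stripe_unit, uint64_t longest_free,
                         bool buggy, int max_retries)
{
    assert(max_retries > 0);
    assert(longest_free >= 1); /* model assumes some free space remains */

    for (int attempt = 1; attempt <= max_retries; attempt++) {
        struct toy_alloc_args args = {
            .fsbno = TOY_NULLFSBLOCK,
            .minlen = stripe_unit,
        };

        toy_alloc_aligned(&args, 1000, longest_free, buggy);

        if (args.fsbno == TOY_NULLFSBLOCK) {
            /* Failure detected: fall back to a single-block allocation. */
            args.minlen = 1;
            return attempt;
        }

        /* Allocation reported success; it is real only if space existed. */
        if (longest_free >= stripe_unit)
            return attempt;

        /* False success: nothing new was mapped, so the caller retries. */
    }
    return 0;
}
```

With a 32-block stripe unit and only single-block free extents,
toy_writeback(32, 1, false, 100) falls back on the first attempt, while
toy_writeback(32, 1, true, 100) exhausts the retry cap without ever
reaching the fallback.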


