Re: [PATCH 1/4] xfs: Don't wrap growfs AGFL indexes

"Darrick J. Wong" <darrick.wong@xxxxxxxxxx> · Tue, 19 Apr 2016 13:51:30 -0700 (PDT)

On Tue, Apr 19, 2016 at 04:13:25PM -0400, Eric Sandeen wrote:
> 
> 
> On 4/7/16 7:50 PM, Christoph Hellwig wrote:
> > On Tue, Apr 05, 2016 at 04:05:07PM +1000, Dave Chinner wrote:
> >> From: Dave Chinner <dchinner@xxxxxxxxxx>
> >>
> >> Commit 96f859d ("libxfs: pack the agfl header structure so
> >> XFS_AGFL_SIZE is correct") allowed the freelist to use the empty
> >> slot at the end of the freelist on 64 bit systems that was not
> >> being used due to sizeof() rounding up the structure size.
> >>
> >> This has caused versions of xfs_repair prior to 4.5.0 (which also
> >> has the fix) to report this as a corruption once the filesystem has
> >> been grown. Older kernels can also have problems (seen from a whacky
> >> container/vm management environment) mounting filesystems grown on a
> >> system with a newer kernel than the vm/container it is deployed on.
> >>
> >> To avoid this problem, change the initial free list indexes not to
> >> wrap across the end of the AGFL, hence avoiding the initialisation
> >> of agf_fllast to the last index in the AGFL.
> > 
> > I have to admit that it's been a while since I looked at the AGFL
> > code, but I simply don't understand what's happening in this patch.
> > Diff slightly reordered:
> > 
> >> -		agf->agf_flfirst = 0;
> >> +		agf->agf_flfirst = cpu_to_be32(1);
> > 
> > So flfirst moves from 0 to 1.
> > 
> >> -		agf->agf_fllast = cpu_to_be32(XFS_AGFL_SIZE(mp) - 1);
> >> +		agf->agf_fllast = 0;
> > 
> > And last from size - 1 to 0.  In my naive reading this introduces
> > wrapping and doesn't remove it.  What am I missing?
> 
> I'm confused by this too.  I think this fixes it because regardless
> of XFS_AGFL_SIZE under any kernel, when we follow the circular list
> we'll wrap around at the "right" limit, if we start out wrapped
> as above, rather than potentially filling in a number for last which
> doesn't match the running code?
> 
> Anyway, it does fix the testcase of "mkfs with
> old xfsprogs; grow under new kernel; repair with old progs" which
> used to complain about e.g. "fllast 118 in agf 94 too large (max = 118)".
> A growfs under a new kernel, and a mount under an old kernel
> showed the same problems; this should fix that as well.
> 
> We seem to have a few problems introduced
> by the AGFL header packing; we have checks (in xfs_agf_verify(), for example,
> and xfs_repair's verify_set_agf()) which depend on the size of this structure.
> If the size moves in the "wrong" way the checks fire off as corruption.
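
For what it's worth, the arithmetic behind that "max = 118" complaint can be
sketched roughly as below.  This is only an illustration, not the kernel or
xfs_repair code: the 512-byte sector, the 36-byte packed header size, and the
helper names agfl_size() and agf_indexes_ok() are assumptions picked to match
the numbers in the report above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes: 512-byte sector, 36 bytes of v5 AGFL header fields. */
enum {
	SECT_SIZE     = 512,
	HDR_PACKED    = 36,	/* sizeof() once the header is packed */
	HDR_PADDED_64 = 40,	/* sizeof() rounded up on 64-bit before packing */
};

/* Number of 32-bit free list slots left after a header of hdr_size bytes. */
static uint32_t agfl_size(uint32_t hdr_size)
{
	assert(hdr_size < SECT_SIZE);
	return (SECT_SIZE - hdr_size) / sizeof(uint32_t);
}

/*
 * Simplified stand-in for the verifier range checks: both indexes must
 * be below the slot count that the running code believes in.
 */
static int agf_indexes_ok(uint32_t flfirst, uint32_t fllast, uint32_t size)
{
	return flfirst < size && fllast < size;
}
```

With these assumptions agfl_size(HDR_PADDED_64) is 118 and
agfl_size(HDR_PACKED) is 119.  The old growfs initialisation under packed code
gives flfirst = 0, fllast = 118, which fails agf_indexes_ok() for a checker
that believes in 118 slots.  The patched initialisation (flfirst = 1,
fllast = 0) passes for either slot count, because neither index depends on
the size.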

We could also pad struct xfs_agfl so that the size is always 40 bytes, like it
used to be on 64-bit; then always write NULLAGBLOCK to the slot at the end of
the sector, which should be past XFS_AGFL_SIZE().  This means 32-bit will be
broken if you run a new xfsprogs with an old kernel, but all the complaints
from the (hopefully larger?) number of 64-bit xfs users will go away.

(Of course, now there are all the people who already pulled in the first agfl fix...)

Hurghahgrhrghmfh. Messy. <sigh>

--D

> 
> It seems to me that now, mismatches between userspace/kernelspace versions
> will cause these size checks to fail; that seems much more common (and worse)
> than the original problem of migrating a filesystem between 32 and 64 bit
> machines.
> 
> I'm trying to convince myself that we don't have a lot more of these lurking
> with all the combinations of old/new kernels & old/new userspace, or filesystems
> migrated between old/new kernels, etc.  This patch is ok for initialization but
> isn't it still quite possible to end up with an fllast set at runtime
> which is outside the valid range for older userspace or kernel code?
> 
> -Eric
> 
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs
