On 4/7/16 7:50 PM, Christoph Hellwig wrote:
> On Tue, Apr 05, 2016 at 04:05:07PM +1000, Dave Chinner wrote:
>> From: Dave Chinner <dchinner@xxxxxxxxxx>
>>
>> Commit 96f859d ("libxfs: pack the agfl header structure so
>> XFS_AGFL_SIZE is correct") allowed the freelist to use the empty
>> slot at the end of the freelist on 64 bit systems that was not
>> being used due to sizeof() rounding up the structure size.
>>
>> This has caused versions of xfs_repair prior to 4.5.0 (which also
>> has the fix) to report this as a corruption once the filesystem has
>> been grown. Older kernels can also have problems (seen from a whacky
>> container/vm management environment) mounting filesystems grown on a
>> system with a newer kernel than the vm/container it is deployed on.
>>
>> To avoid this problem, change the initial free list indexes not to
>> wrap across the end of the AGFL, hence avoiding the initialisation
>> of agf_fllast to the last index in the AGFL.
>
> I have to admit that it's been a while that I looked at the AGFL
> code, but I simply don't understand what's happening in this patch.
> Diff slightly reorder:
>
>> - agf->agf_flfirst = 0;
>> + agf->agf_flfirst = cpu_to_be32(1);
>
> So flfirst moves from 0 to 1.
>
>> - agf->agf_fllast = cpu_to_be32(XFS_AGFL_SIZE(mp) - 1);
>> + agf->agf_fllast = 0;
>
> And last from size - 1 to 0. In my naive reading this introduces
> wrapping and doesn't remove it. What do I miss?

I'm confused by this too.

I think this fixes it because regardless of XFS_AGFL_SIZE under any
kernel, when we follow the circular list we'll wrap around at the
"right" limit, if we start out wrapped as above, rather than
potentially filling in a number for last which doesn't match the
running code?

Anyway, it does fix the testcase of "mkfs with old xfsprogs; grow
under new kernel; repair with old progs" which used to complain
about i.e.

  "fllast 118 in agf 94 too large (max = 118)"

A growfs under a new kernel, and a mount under an old kernel showed
the same problems; this should fix that as well.

We seem to have a few problems introduced by the AGFL header packing;
we have checks (in xfs_agf_verify(), for example, and xfs_repair's
verify_set_agf()) which depend on the size of this structure. If the
size moves in the "wrong" way the checks fire off as corruption.

It seems to me that now, mismatches between userspace/kernelspace
versions will cause these size checks to fail; that seems much more
common (and worse) than the original problem of migrating a
filesystem between 32 and 64 bit machines.

I'm trying to convince myself that we don't have a lot more of these
lurking with all the combinations of old/new kernels & old/new
userspace, or filesystems migrated between old/new kernels, etc.

This patch is ok for initialization but isn't it still quite possible
to end up with an fllast set at runtime which is outside the valid
range for older userspace or kernel code?

-Eric