On Thu, Apr 16, 2015 at 01:32:38PM -0400, Brian Foster wrote: > On Thu, Apr 16, 2015 at 03:00:50PM +1000, Dave Chinner wrote: > > From: Dave Chinner <dchinner@xxxxxxxxxx> > > > > This results in BMBT corruption, as seen by this test: > > > > # mkfs.xfs -f -d size=40051712b,agcount=4 /dev/vdc > > .... > > # mount /dev/vdc /mnt/scratch > > # xfs_io -ft -c "extsize 16m" -c "falloc 0 30g" -c "bmap -vp" /mnt/scratch/foo > > > > which results in this failure on a debug kernel: > > > > XFS: Assertion failed: (blockcount & xfs_mask64hi(64-BMBT_BLOCKCOUNT_BITLEN)) == 0, file: fs/xfs/libxfs/xfs_bmap_btree.c, line: 211 > > .... .... > > Looks fine from the perspective of applying pre-existing logic to a > separate codepath, but... That was my thought.... > > + * Calculate the maximum extent length we can ask to allocate after taking into > > + * account the on-disk size limitations, the extent size hints and the size > > + * being requested. We have to deal with the extent size hint here because the > > + * allocation will attempt alignment and hence grow the length outwards by up to > > + * @extsz on either side. > > + */ > > +static inline xfs_extlen_t > > +xfs_bmapi_max_extlen( > > + struct xfs_inode *ip, > > + xfs_extlen_t length) > > +{ > > + xfs_extlen_t extsz = xfs_get_extsz_hint(ip); > > + xfs_extlen_t max_length = MAXEXTLEN; > > + > > + if (extsz) > > + max_length -= 2 * extsz - 1; > > This can underflow or cause other issues if set to just the right value > (with smaller block sizes such that length can be trimmed to 0): But I assumed the existing code was correct for this context. My bad. :/ > $ mkfs.xfs -f -bsize=1k <dev> > $ mount <dev> /mnt > $ xfs_io -f -c "extsize 1g" -c "pwrite 0 4k" -c fsync /mnt/file > pwrite64: No space left on device Yup, because it 2^21 = 2G, and extsize = 1g puts max_length < 0. Ok. So, the problem is that it is overestimating the amount of space that alignment will need, and that alignment cannot be guaranteed for extsz hints of over (MAXEXTLEN / 2) in size. i.e. given an alignment (A[0-2]) and an extent (E[01]): A0 A1 A2 +-------------------+-------------------+ +ooo+ E0 E1 The problem is that the alignment done by xfs_bmap_extsize_align() only extends outwards (i.e. increases extent size). Hence E0 gets rounded down to A0-A2, and E1 gets extended to A2, which means we are adding almost 2 entire extent size hints to the allocation. That's where the reduction in length by two extsz values came from. Now, for delayed allocation, this is just fine, because real allocation will break this delalloc extent up into two separate extents, and underflow wouldn't be noticed as delalloc extents are not physically limited to MAXEXTLEN and so nothing would have broken. Still, it's not the intended behaviour. I'm not sure what the solution is yet - the fundamental problem here is the outwards alignment of both ends of the extent, and this MAXEXTLEN twiddling is just an encoding of that behaviour. I need to spend some time looking at xfs_bmap_extsize_align() and determining if there is something we can do differently here. > (Both that and the original reproducer might make a good xfstests test, > btw...) Yeah, I think I mentioned that on IRC to sandeen when I wrote the first fix. Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs