On 4/25/12 6:23 PM, Dave Chinner wrote: > From: Dave Chinner <dchinner@xxxxxxxxxx> > > I've been seeing regular ASSERT failures in xfstests when running > fsstress based tests over the past month. xfs_getbmap() has been > failing this test: > > XFS: Assertion failed: ((iflags & BMV_IF_DELALLOC) != 0) || > (map[i].br_startblock != DELAYSTARTBLOCK), file: fs/xfs/xfs_bmap.c, > line: 5650 > > where it is encountering a delayed allocation extent after writing > all the dirty data to disk and then walking the extent map > atomically by holding the XFS_IOLOCK_SHARED to prevent new delayed > allocation extents from being created. > > Test 083 on a 512 byte block size filesystem was used to reproduce > the problem, because it only had a 5s run timeand would usually fail > every 3-4 runs. This test is exercising ENOSPC behaviour by running > fsstress on a nearly full filesystem. The following trace extract > shows the final few events on the inode that tripped the assert: > > xfs_ilock: flags ILOCK_EXCL caller xfs_setfilesize > xfs_setfilesize: isize 0x180000 disize 0x12d400 offset 0x17e200 count 7680 > > file size updated to 0x180000 by IO completion > > xfs_ilock: flags ILOCK_EXCL caller xfs_iomap_write_delay > xfs_iext_insert: state idx 3 offset 3072 block 4503599627239432 count 1 flag 0 caller xfs_bmap_add_extent_hole_delay > xfs_get_blocks_alloc: size 0x180000 offset 0x180000 count 512 type startoff 0xc00 startblock -1 blockcount 0x1 > xfs_ilock: flags ILOCK_EXCL caller __xfs_get_blocks > > delalloc write, adding a single block at offset 0x180000 > > xfs_delalloc_enospc: isize 0x180000 disize 0x180000 offset 0x180200 count 512 > > ENOSPC trying to allocate a dellalloc block at offset 0x180200 > > xfs_ilock: flags ILOCK_EXCL caller xfs_iomap_write_delay > xfs_get_blocks_alloc: size 0x180000 offset 0x180200 count 512 type startoff 0xc00 startblock -1 blockcount 0x2 > > And succeeding on retry after flushing dirty inodes. > > xfs_ilock: flags ILOCK_EXCL caller __xfs_get_blocks > xfs_delalloc_enospc: isize 0x180000 disize 0x180000 offset 0x180400 count 512 > > ENOSPC trying to allocate a dellalloc block at offset 0x180400 > > xfs_ilock: flags ILOCK_EXCL caller xfs_iomap_write_delay > xfs_delalloc_enospc: isize 0x180000 disize 0x180000 offset 0x180400 count 512 > > And failing the retry, giving a real ENOSPC error. > > xfs_ilock: flags ILOCK_EXCL caller xfs_vm_write_failed > ^^^^^^^^^^^^^^^^^^^ > The smoking gun - the write being failed and cleaning up delalloc > blocks beyond EOF allocated by the failed write. > > xfs_getattr: > xfs_ilock: flags IOLOCK_SHARED caller xfs_getbmap > xfs_ilock: flags ILOCK_SHARED caller xfs_ilock_map_shared > > And that's where we died almost immediately afterwards. > xfs_bmapi_read() found delalloc extent beyond current file in memory > file size. Some debug I added to xfs_getbmap() showed the state just > before the assert failure: > > ino 0x80e48: off 0xc00, fsb 0xffffffffffffffff, len 0x1, size 0x180000 > start_fsb 0x106, end_fsb 0x638 > ino flags 0x2 nex 0xd bmvcnt 0x555, len 0x3c58a6f23c0bf1, start 0xc00 > ext 0: off 0x1fc, fsb 0x24782, len 0x254 > ext 1: off 0x450, fsb 0x40851, len 0x30 > ext 2: off 0x480, fsb 0xd99, len 0x1b8 > ext 3: off 0x92f, fsb 0x4099a, len 0x3b > ext 4: off 0x96d, fsb 0x41844, len 0x98 > ext 5: off 0xbf1, fsb 0x408ab, len 0xf > > which shows that we found a single delalloc block beyond EOF (first > line of output) when we were returning the map for a length > somewhere around 10^16 bytes long (second line), and the on-disk > extents showed they didn't go past EOF (last lines). > > Further debug added to xfs_vm_write_failed() showed this happened > when punching out delalloc blocks beyond the end of the file after > the failed write: > > [ 132.606693] ino 0x80e48: vwf to 0x181000, sze 0x180000 > [ 132.609573] start_fsb 0xc01, end_fsb 0xc08 > > It punched the range 0xc01 -> 0xc08, but the range we really need to > punch is 0xc00 -> 0xc07 (8 blocks from 0xc00) as this testing was > run on a 512 byte block size filesystem (8 blocks per page). > the punch from is 0xc00. So end_fsb is correct, but start_fsb is > wrong as we punch from start_fsb for (end_fsb - start_fsb) blocks. > Hence we are not punching the delalloc block beyond EOF in the case. > > The fix is simple - it's a silly off-by-one mistake in calculating > the range. It's especially silly because the macro used to calculate > the start_fsb already takes into account the case where the inode > size is an exact multiple of the filesystem block size... > > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx> tl;dr, but I get the gist ;) I find these macros confusing. Reviewed-by: Eric Sandeen <sandeen@xxxxxxxxxx> Thanks for a patch & can grok & review ;) > --- > fs/xfs/xfs_aops.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c > index b66766a..64ed87a 100644 > --- a/fs/xfs/xfs_aops.c > +++ b/fs/xfs/xfs_aops.c > @@ -1431,7 +1431,7 @@ xfs_vm_write_failed( > * Check if there are any blocks that are outside of i_size > * that need to be trimmed back. > */ > - start_fsb = XFS_B_TO_FSB(ip->i_mount, inode->i_size) + 1; > + start_fsb = XFS_B_TO_FSB(ip->i_mount, inode->i_size); > end_fsb = XFS_B_TO_FSB(ip->i_mount, to); > if (end_fsb <= start_fsb) > return; _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs