On Tue, Apr 12, 2011 at 04:53:28PM +1000, Dave Chinner wrote:
> On Mon, Apr 11, 2011 at 08:55:33PM -0400, Lachlan McIlroy wrote:
> > ----- Original Message -----
> > > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > >
> > > When a multilevel bmbt split occurs, we can be asked to allocate
> > > blocks from an AG that has no space left available. In the case of
> > > an extent just being allocated, the first bmbt block allocation sees
> > > the firstblock parameter is set and does not set a minleft parameter
> > > for the allocation. The allocation also does not set the total
> > > number of blocks required by the allocation, either.
...
> So, the patch below fixes the test 250 assert failure as well, and
> to me seems much more likely as the root cause of the bug.

FWIW, test 250 is showing up another three bugs, all unrelated to
the bug it was written to exercise. Two are mkfs bugs - the first is
that mkfs is terminated due to freeing an invalid pointer. The
second is that it leaves behind a corrupted freespace btree:

_check_xfs_filesystem: filesystem on /mnt/test/250.fs is inconsistent
*** xfs_repair -n output ***
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - scan filesystem freespace and inode maps...
invalid start block 4096 in record 0 of 4635043 btree block 1600/1
invalid start block 4096 in record 0 of 4635039 btree block 1600/2
        - found root inode chunk
Phase 3 - for each AG...
....

xfs_db> convert agno 1600 agbno 1 fsb
0x640001 (6553601)
xfs_db> fsb 0x640001
xfs_db> type bnobt
xfs_db> p
magic = 0x41425442
level = 0
numrecs = 1
leftsib = null
rightsib = null
recs[1] = [startblock,blockcount] 1:[4096,0]
xfs_db> fsb 0x640002
xfs_db> type cntbt
xfs_db> p
magic = 0x41425443
level = 0
numrecs = 1
leftsib = null
rightsib = null
recs[1] = [startblock,blockcount] 1:[4096,0]
xfs_db>

So in AG 1600, there is a freespace record at block 4096 for length
zero. Both values are incorrect - the AG size is only 4096 blocks, so
block 4096 is past the end of the AG, and a free extent cannot have
zero length.
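
To make the numbers above easier to follow, here is a rough sketch (not
xfs_db or xfs_repair code) of the two calculations involved: packing
agno/agbno into an fsb, and sanity-checking a freespace record against
the AG size. It assumes the fsb is (agno << agblklog) | agbno with
agblklog = 12, which matches a 4096-block AG and the 0x640001 value
shown. The function names and problem strings are made up for
illustration.

```python
# Sketch only: assumes fsb = (agno << agblklog) | agbno, with agblklog = 12
# because the AG size in this filesystem is 4096 blocks (2**12).
# All names below are illustrative, not taken from xfsprogs.

AG_BLOCKS = 4096   # AG size reported by the AGF ("length = 4096")
AG_BLK_LOG = 12    # log2(4096); assumed here, not read from the superblock


def agb_to_fsb(agno, agbno):
    """Pack an AG number and an AG-relative block number into one fsb."""
    return (agno << AG_BLK_LOG) | agbno


def free_record_problems(startblock, blockcount, ag_blocks=AG_BLOCKS):
    """Return a list of reasons why a [startblock, blockcount] record is bad."""
    problems = []
    if startblock >= ag_blocks:
        # Valid AG block numbers run from 0 to ag_blocks - 1.
        problems.append("start block beyond end of AG")
    if blockcount == 0:
        problems.append("zero-length extent")
    elif startblock + blockcount > ag_blocks:
        problems.append("extent runs past end of AG")
    return problems


if __name__ == "__main__":
    print(hex(agb_to_fsb(1600, 1)))       # bnobt root block in AG 1600
    print(free_record_problems(4096, 0))  # the record mkfs left behind
```

With the record from the dump, [4096,0], both the start block check and
the length check fail, which is the "both are incorrect" case described
above.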

Worth noting:

xfs_db> agf 1600
xfs_db> p
magicnum = 0x58414746
versionnum = 1
seqno = 1600
length = 4096
bnoroot = 1
cntroot = 2
bnolevel = 1
cntlevel = 1
flfirst = 0
fllast = 127
flcount = 0
freeblks = 0
longest = 0
btreeblks = 0

All the freelist blocks have been consumed, which should not happen -
there should be 4 freelist blocks when the AG is empty. Looking a
bit deeper, AG 1600 doesn't contain any data blocks - it contains
the log. The log is allocated by mkfs, and is 4092 blocks in length.

I just confirmed that the repair failure occurs on a freshly made
FS, so this is definitely a mkfs bug that hasn't been noticed
because the test hasn't been running to completion and checking the
fs....

And the third bug is in repair:

invalid start block 4096 in record 0 of 4635043 btree block 1600/1
invalid start block 4096 in record 0 of 4635039 btree block 1600/2
                                        ^^^^^^^

These are supposed to say "bno" or "cnt", but a %d is incorrectly
used instead of a %s in the format string, so it gives a wacky
result.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs