Hi Dave, On Fri, Jan 14, 2011 at 10:15:44AM +1100, Dave Chinner wrote: > On Thu, Jan 13, 2011 at 01:39:28PM -0600, bpm@xxxxxxx wrote: > > Hi Dave, > > > > On Thu, Jan 13, 2011 at 11:52:07AM +1100, Dave Chinner wrote: > > > On Wed, Jan 12, 2011 at 01:42:28PM -0600, Ben Myers wrote: > > > > When filling in the middle of a previous delayed allocation, set > > > > br_startoff of the new delay extent to the right to NULLSTARTBLOCK > > > > so that it is ignored by xfs_bmap_extent_to_btree. This prevents > > > > a forced shutdown when that in-core extent is converted from delay > > > > to real and is found to be already in the btree. The value is > > > > overwritten below. > > > > You're right. That's not a very good description. I'll see about > > describing it here and then see about boiling it down for repost. > > > > > I'm not sure I understand what the problem is from your description. > > > What actually goes wrong in xfs_bmap_extent_to_btree()? > > > > While testing a related patch whose side effect is to exercise this code > > more often... > > > > I was hitting XFS_WANT_CORRUPTED_GOTOs in xfs_bmap_extent_delay_real in the > > LEFT_FILLING, RIGHT_FILLING cases where we are going to insert an extent > > into the btree and we look it up to make sure it isn't already there: > > > > From xfs_bmap_extent_delay_real: > > 911 case BMAP_LEFT_FILLING: > > 912 /* > > 913 * Filling in the first part of a previous delayed allocation. > > 914 * The left neighbor is not contiguous. > > 915 */ > > 916 trace_xfs_bmap_pre_update(ip, idx, state, _THIS_IP_); > > 917 xfs_bmbt_set_startoff(ep, new_endoff); > > 918 temp = PREV.br_blockcount - new->br_blockcount; > > 919 xfs_bmbt_set_blockcount(ep, temp); > > 920 xfs_iext_insert(ip, idx, 1, new, state); > > 921 ip->i_df.if_lastex = idx; > > 922 ip->i_d.di_nextents++; > > 923 if (cur == NULL) > > 924 rval = XFS_ILOG_CORE | XFS_ILOG_DEXT; > > 925 else { > > 926 rval = XFS_ILOG_CORE; > > 927 if ((error = xfs_bmbt_lookup_eq(cur, new->br_startoff, > > 928 new->br_startblock, new->br_blockcount, > > 929 &i))) > > 930 goto done; > > 931 XFS_WANT_CORRUPTED_GOTO(i == 0, done); > > > > Forced shutdown at 931. > > So it was allocating a different part of the delalloc extent > that triggered it. OK. > > <snip> > > > This loop in xfs_bmap_extents_to_btree copies extents from the ifork into > > the btree: > > > > 3242 /* > > 3243 * Fill in the child block. > > 3244 */ > > 3245 ablock = XFS_BUF_TO_BLOCK(abp); > > 3246 ablock->bb_magic = cpu_to_be32(XFS_BMAP_MAGIC); > > 3247 ablock->bb_level = 0; > > 3248 ablock->bb_u.l.bb_leftsib = cpu_to_be64(NULLDFSBNO); > > 3249 ablock->bb_u.l.bb_rightsib = cpu_to_be64(NULLDFSBNO); > > 3250 arp = XFS_BMBT_REC_ADDR(mp, ablock, 1); > > 3251 nextents = ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t); > > 3252 for (cnt = i = 0; i < nextents; i++) { > > 3253 ep = xfs_iext_get_ext(ifp, i); > > 3254 if (!isnullstartblock(xfs_bmbt_get_startblock(ep))) { > > > > ^ note that it explicitly ignores delay allocation extents by testing for > > STARTBLOCKMASK in br_startblock. > > > > 3255 arp->l0 = cpu_to_be64(ep->l0); > > 3256 arp->l1 = cpu_to_be64(ep->l1); > > 3257 arp++; cnt++; > > 3258 } > > 3259 } > > > > So... when we converted from extents format to btree format we copied > > extent to the right into the tree because right.br_startblock = 0 > > which is not a nullstartblock. The right extent was actually > > destined to become delalloc in the ifork but that assignment > > happens only after we've converted to btree format. > > > > The fix works because we set br_startblock to something that includes > > STARTBLOCKMASK before inserting it into the ifork. This way the test at > > 3254 will fail for that extent and it won't be inserted into the btree. > > > > > I'm assuming > > > that it counts one too many real extents, and if so, shouldn't it > > > fire this assert long before you get to a shutdown situation? > > > > > > ASSERT(cnt == XFS_IFORK_NEXTENTS(ip, whichfork)); > > > > It should. > > Thanks for confirming that. Might I suggest that running a debug XFS > build is a good idea purely because it will catch problems like this > when they happen, rather than when you trip over the result later > on? Agreed. I'd prefer to be running debug xfs builds. > > > Also, do you have a test case that triggers this? If so, can it be > > > turned into a xfstests case? I like to have some idea of how the > > > problem was encountered and verified, because this code is complex > > > and easy to misunderstand... > > > > I don't have a test case. Building one might be possible by using extsize > > to get large delalloc allocations and modifying xfs_iomap_write_allocate to > > only allocate for the passed in offset and count instead of from passed in > > iomap... > > Hmmm, not really a generic test case :/ > > I was thinking that we might be able to use sync_file_range() to > write small ranges of pages inside a delalloc extent. However, it > does not limit wbc->nr_to_write() so I suspect it will write entire > delalloc extents at a time rather than small chunks. If I'm understanding xfs_iomap_write_allocate correctly, it seems that the code for writing into the middle of a delalloc extent isn't exercised very often. > > > > SGI-PV: 1013221 > > > > > > The following is mostly the notes I wrote to understand what your > > > patch does. I'm posting them so others don't need to go through the > > > same analysis to understand the patch. While you might have the > > > analysis in the above PV we can't see it at all, so it would be > > > appreciated if you could put a summary of the bug analysis and > > > test case in the commit message so we don't have to spend a couple of > > > hours just to work out what the patch does... > > > > Here are some traces from the PV: > > > > Dec 21 13:08:22 gtomds1 kernel: [ 2008.237772] xfs_iext_insert (ffff88018de2c3c0) br_startoff 9240 br_startblock 304078611 br_blockcount 512 br_state 0 > > Dec 21 13:08:22 gtomds1 kernel: [ 2008.237775] xfs_bmbt_set_all set w/ startblock == 0. br_startoff 9752, br_startblock 0 br_blockcount 18408, br_state 0 > > .... > > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238578] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 0: startoff 0 startblock NULLSTARTBLOCK(5) blockcount 5 flag 0 > > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238583] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 1: startoff 5 startblock 304069376[4:21fbb00] blockcount 512 flag 0 > > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238590] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 2: startoff 517 startblock NULLSTARTBLOCK(5) blockcount 2 flag 0 > > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238594] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 3: startoff 519 startblock 304069890[4:21fbd02] blockcount 512 flag 0 > > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238598] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 4: startoff 1031 startblock NULLSTARTBLOCK(5) blockcount 1 flag 0 > > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238602] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 5: startoff 1032 startblock 304070403[4:21fbf03] blockcount 1024 flag 0 > > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238606] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 6: startoff 2056 startblock NULLSTARTBLOCK(5) blockcount 3 flag 0 > > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238610] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 7: startoff 2059 startblock 304071430[4:21fc306] blockcount 1024 flag 0 > > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238614] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 8: startoff 3083 startblock NULLSTARTBLOCK(5) blockcount 3 flag 0 > > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238618] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 9: startoff 3086 startblock 304072457[4:21fc709] blockcount 2560 flag 0 > > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238623] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 10: startoff 5646 startblock NULLSTARTBLOCK(5) blockcount 1 flag 0 > > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238627] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 11: startoff 5647 startblock 304075018[4:21fd10a] blockcount 512 flag 0 > > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238631] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 12: startoff 6159 startblock NULLSTARTBLOCK(5) blockcount 4 flag 0 > > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238635] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 13: startoff 6163 startblock 304075534[4:21fd30e] blockcount 1024 flag 0 > > Dec 21 13:08:23 gtomds1 kernel: [ 2008.238639] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 14: startoff 7187 startblock NULLSTARTBLOCK(5) blockcount 2 flag 0 > > Dec 21 13:08:23 gtomds1 kernel: [ 2008.238643] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 15: startoff 7189 startblock 304076560[4:21fd710] blockcount 1024 flag 0 > > Dec 21 13:08:23 gtomds1 kernel: [ 2008.238647] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 16: startoff 8213 startblock NULLSTARTBLOCK(5) blockcount 2 flag 0 > > Dec 21 13:08:23 gtomds1 kernel: [ 2008.238651] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 17: startoff 8215 startblock 304077586[4:21fdb12] blockcount 1024 flag 0 > > Dec 21 13:08:23 gtomds1 kernel: [ 2008.238655] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 18: startoff 9239 startblock NULLSTARTBLOCK(79) blockcount 1 flag 0 > > Dec 21 13:08:23 gtomds1 kernel: [ 2008.238659] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 19: startoff 9240 startblock 304078611[4:21fdf13] blockcount 512 flag 0 > > Dec 21 13:08:23 gtomds1 kernel: [ 2008.238663] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 20: startoff 9752 startblock 0[0:0] blockcount 18408 flag 0 <---- we added an extent to the btree with startblock=0 > > .... > > Dec 21 13:08:42 gtomds1 kernel: [ 2009.735411] xfs_bmap_add_extent_delay_real new br_startoff 9752 br_startblock 304079636 br_blockcount1 <---- and it turns out to be the same one we crash on later! > > It's this pattern of alloc/delalloc extents that I'm having trouble > reproducing. The current TOT XFS code simply will not allocate the > middle part of a delalloc extent like this. What exactly is your test > doing to produce this sort of pattern? Also, delalloc behaviour has > changed over different versions, so can you tell me what version of > XFS you are running to cause this? IIRC my test case was dd... I think a bug(s) in the patch I was testing caused the strange allocation patterns. I'll get the test case for that bug posted shortly. Thanks, Ben _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs