Aneesh Kumar K.V wrote:
> On Mon, Jan 04, 2010 at 05:08:55PM -0600, Eric Sandeen wrote:
>> Eric Sandeen wrote:
>>> Theodore Ts'o wrote:
>>>> One of the things which has been annoying me for a while now is a
>>>> hard-to-reproduce xfsqa failure in test #13 (fsstress), which causes the
>>>> a test failure because the file system found to be inconsistent:
>>>>
>>>>    Inode NNN, i_blocks is X, should be Y.
>>> Interesting, this apparently has gotten much worse since 2.6.32.
>>>
>>> I wrote an xfstests reproducer, and couldn't hit it on .32; hit it right
>>> off on 2.6.33-rc2.
>>>
>>> Probably should find out why ;) I'll go take a look.
>> commit d21cd8f163ac44b15c465aab7306db931c606908
>> Author: Dmitry Monakhov <dmonakhov@xxxxxxxxxx>
>> Date:   Thu Dec 10 03:31:45 2009 +0000
>>
>>     ext4: Fix potential quota deadlock
>>
>> seems to be the culprit.
>>
>> (unfortunately this means that the error we saw before is something
>> -else- to be fixed, yet)  Anyway ...
>>
>> This is because we used to do this in ext4_mb_mark_diskspace_used() :
>>
>>         /*
>>          * Now reduce the dirty block count also. Should not go negative
>>          */
>>         if (!(ac->ac_flags & EXT4_MB_DELALLOC_RESERVED))
>>                 /* release all the reserved blocks if non delalloc */
>>                 percpu_counter_sub(&sbi->s_dirtyblocks_counter,
>>                                                 reserv_blks);
>>         else {
>>                 percpu_counter_sub(&sbi->s_dirtyblocks_counter,
>>                                                 ac->ac_b_ex.fe_len);
>>                 /* convert reserved quota blocks to real quota blocks */
>>                 vfs_dq_claim_block(ac->ac_inode, ac->ac_b_ex.fe_len);
>>         }
>>
>> i.e. the vfs_dq_claim_block was conditional based on
>> EXT4_MB_DELALLOC_RESERVED... and the testcase did not go that way,
>> because we had already preallocated the blocks.
>>
>> But with the above quota deadlock commit it's not unconditional
>> anymore in ext4_da_update_reserve_space and we always call
>> vfs_dq_claim_block which over-accounts.
>>
>
> It is still conditional right ? We call ext4_da_update_reserve_space
> only if EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE is set .
> That will happen only in case of delayed allocation. I guess the problem is
> same as what Ted stated. But i am not sure why we are able to reproduce
> it much easily on 2.6.33-rc2.
>

Well, I'll take another look.  But back out the above commit and I think
you'll see that it changed things to make it 100% reproducible.

-Eric
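
For anyone following along, here is a rough userspace toy model of the accounting difference discussed above. It is only a sketch of the hypothesis in this thread (that claiming reserved quota on every writeback, instead of only when the allocator runs with a delalloc reservation, double-charges blocks that were already preallocated). It is not kernel code and has not been checked against the actual ext4 paths; every name in it (toy_inode, toy_policy, toy_prealloc, toy_reserve, toy_claim, toy_writeback) is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-inode quota state. Hypothetical; not a kernel structure. */
struct toy_inode {
	long reserved;	/* blocks reserved for delalloc, not yet claimed */
	long charged;	/* blocks charged to quota (what i_blocks should match) */
};

/* Where the reserved-to-real conversion happens (both names are made up). */
enum toy_policy {
	TOY_CLAIM_IN_ALLOCATOR,	/* only when new blocks are actually allocated */
	TOY_CLAIM_ALWAYS	/* on every writeback, as suspected in this thread */
};

/* Preallocation: blocks are charged to quota immediately. */
static void toy_prealloc(struct toy_inode *ino, long len)
{
	assert(len > 0);
	ino->charged += len;
}

/* Delayed-allocation write: only a reservation is taken. */
static void toy_reserve(struct toy_inode *ino, long len)
{
	assert(len > 0);
	ino->reserved += len;
}

/* Convert reserved blocks into charged blocks. */
static void toy_claim(struct toy_inode *ino, long len)
{
	ino->reserved -= len;
	ino->charged += len;
}

/*
 * Writeback of len blocks. "preallocated" means the blocks already exist,
 * so no new allocation is needed and, in this model, the allocator-side
 * claim is never reached.
 */
static void toy_writeback(struct toy_inode *ino, long len, bool preallocated,
			  enum toy_policy policy)
{
	assert(len > 0);
	if (policy == TOY_CLAIM_ALWAYS || !preallocated)
		toy_claim(ino, len);
}
```

In this model, preallocating 8 blocks and writing them back with TOY_CLAIM_IN_ALLOCATOR leaves charged at 8, while TOY_CLAIM_ALWAYS ends with charged at 16 for the same 8 blocks, which is the kind of "i_blocks is X, should be Y" mismatch fsck would complain about. A plain delalloc write (toy_reserve followed by toy_writeback with preallocated == false) comes out the same under either policy.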