On Tuesday, 21 October 2008 at 11:52 -0400, Theodore Tso wrote:
> Hi Frederic,
>
> Thanks for posting the update to your patch; I take it you've solved
> the race condition?  I haven't taken a look at your updated patch yet,
> but one thought that might make the potential race conditions much
> simpler to analyze and prevent.

Yes, the race condition I found is solved by this patch. The issue
occurred when concurrent threads tried to write to blocks in groups that
had been added by the resize. As I briefly explained in the patch, the
mballoc allocator's data was wrongly being initialized several times.

> At the moment, the resize code, just before it calls to fix up the
> mballoc data structures, calls ext4_free_blocks_sb() to mark the block
> bitmap as being freed.  That call should really go away, as
> ext4_free_blocks_sb() is a remnant from the legacy block allocator, and
> in fact does a lot of extra stuff that is not needed by mballoc().
> Perhaps the right answer is that we should have one function that
> updates the block bitmap, as well as initializing the mballoc() data
> structures, and it would *only* be called from the resize code.

OK, I will take a look at this function and see if I can update or clean
it up.

> If the concern is protecting against multiple resizers running at the
> same time, then let's either (a) not call unlock_super() until the
> mballoc data structures are initialized, or (b) create a new mutex
> that is explicit for use by the online resize code.

In fact, I have never tested with multiple resizers until now, because I
never managed to run several instances of resize2fs concurrently: if one
resize2fs is running, the second one simply fails with a "device busy"
error.

Frederic