On Tue, Feb 05, 2008 at 02:42:28PM +0100, Jan Kara wrote: > On Tue 05-02-08 17:53:42, Aneesh Kumar K.V wrote: > > > > How about the patch below. I did the below testing > > a) migrate a file > > b) run fs_inode fsstres fsx_linux. > > > > The intention was to find out whether the new locking is breaking any of > > the other expected hierarchy. It seems to fine. I didn't get any lockdep > > warning. > I think there's a problem in the patch. I don't think you can call > free_ind_block() while readers are accessing the file. That could make them > think the file contains holes when they aren't there (or even worse read > freed blocks or so). So you need to lock i_data_sem before this call (that > means you have to move journal_extend() as well). Actually, I don't quite > get why ext4_journal_extend() is called in that function. You can just > count with the 1 credit in ext4_ext_migrate() when you call > ext4_journal_extend() before calling ext4_ext_swap_inode_data(). Oh, > probably because free_ind_block() could extend the transaction (which they > don't do now as far as I can see). ext4_journal_extend is called to extend the journal by one credit to take care of writing the block containing inode. I moved the journal extend before taking i_data_sem lock. diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c index f97c993..dc6617a 100644 --- a/fs/ext4/migrate.c +++ b/fs/ext4/migrate.c @@ -302,10 +302,6 @@ static int ext4_ext_swap_inode_data(handle_t *handle, struct inode *inode, struct ext4_inode_info *ei = EXT4_I(inode); struct ext4_inode_info *tmp_ei = EXT4_I(tmp_inode); - retval = free_ind_block(handle, inode); - if (retval) - goto err_out; - /* * One credit accounted for writing the * i_data field of the original inode @@ -318,6 +314,10 @@ static int ext4_ext_swap_inode_data(handle_t *handle, struct inode *inode, } down_write(&EXT4_I(inode)->i_data_sem); + retval = free_ind_block(handle, inode); + if (retval) + goto err_out; + /* * We have the extent map build with the tmp inode. * Now copy the i_data across The above change also make sure we don't fail after we free the indirect blocks. > BTW: That freeing code should really take into account that it can modify > bitmaps in different block groups. It's even not that hard to do. Just > before each ext4_free_blocks() in free_ind_block() you check whether you > have still enough credits in the handle (use h_buffer_credits) and if not, > extend it by some amount. I have a FIXME at migrate.c:524 documenting exactly that. The difficult question was by how much we should extent the journal. ? But in reality we might have accumulated enough journal credits, I never really ran across a case where we are running out of the journal credit. > Maybe you could do some test like writing a big file with some data and then > while migration is running read it in a loop and compare that MD5SUM is the > same all the time. Also run some memory-pressure during this test so that > data blocks aren't cached for the whole time of the test... That should > reasonably stress the migration code. I have tested migrate by booting with mem= and doing parallel read/write and migrate. I didn't do the MDSUM compare. Will do that this time. Thanks for all the help with review. -aneesh > - To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html