On 07/23/2012 12:18 AM, Theodore Ts'o wrote: > On Mon, Apr 02, 2012 at 11:14:20AM -0400, Brian Foster wrote: >> From: Brian Foster <bfoster@xxxxxxxxxx> >> >> If we call ext4_calc_metadata_amount() and then fail to claim >> the space, a subsequent successful request to a block covered >> by the same ext2 indirect block will set md_needed to 0, fail >> to update the associated counters and result in a delayed md >> block allocation without a reservation. Decrement >> i_da_metadata_calc_len on failure to ensure the next >> reservation sets md_needed correctly. >> >> Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx> > > Hi Brian, > > This patch was not quite right. i_data_metadata_calc_len is not the > only side effect of ext4_calc_metadata(). We need to undo the side > effects not only when exit with an error, but also if we yield and > then retry the claim. Also, in the extent case we don't always modify > i_data_metadata_calc_len, so just decrementing isn't necessarily going > to do the right thing. > > So this is the patch which I am testing.... > Hi Ted, Ok, your patch makes sense. Thanks for the explanation. FWIW, I ran it through my reproducer a couple times and it passes. Brian > - Ted > > commit 19ec0f1fe139a642f688177ffd2f91a1c09f6bc0 > Author: Theodore Ts'o <tytso@xxxxxxx> > Date: Mon Jul 23 00:00:20 2012 -0400 > > ext4: undo ext4_calc_metadata_amount if we fail to claim space > > The function ext4_calc_metadata_amount() has side effects, although > it's not obvious from its function name. So if we fail to claim > space, regardless of whether we retry to claim the space again, or > return an error, we need to undo these side effects. > > Otherwise we can end up incorrectly calculating the number of metadata > blocks needed for the operation, which was responsible for an xfstests > failure for test #271 when using an ext2 file system with delalloc > enabled. > > Reported-by: Brian Foster <bfoster@xxxxxxxxxx> > Signed-off-by: "Theodore Ts'o" <tytso@xxxxxxx> > Cc: stable@xxxxxxxxxxxxxxx > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > index 25f809d..89b59cb 100644 > --- a/fs/ext4/inode.c > +++ b/fs/ext4/inode.c > @@ -1182,6 +1182,17 @@ static int ext4_da_reserve_space(struct inode *inode, ext4_lblk_t lblock) > struct ext4_inode_info *ei = EXT4_I(inode); > unsigned int md_needed; > int ret; > + ext4_lblk_t save_last_lblock; > + int save_len; > + > + /* > + * We will charge metadata quota at writeout time; this saves > + * us from metadata over-estimation, though we may go over by > + * a small amount in the end. Here we just reserve for data. > + */ > + ret = dquot_reserve_block(inode, EXT4_C2B(sbi, 1)); > + if (ret) > + return ret; > > /* > * recalculate the amount of metadata blocks to reserve > @@ -1190,32 +1201,31 @@ static int ext4_da_reserve_space(struct inode *inode, ext4_lblk_t lblock) > */ > repeat: > spin_lock(&ei->i_block_reservation_lock); > + /* > + * ext4_calc_metadata_amount() has side effects, which we have > + * to be prepared undo if we fail to claim space. > + */ > + save_len = ei->i_da_metadata_calc_len; > + save_last_lblock = ei->i_da_metadata_calc_last_lblock; > md_needed = EXT4_NUM_B2C(sbi, > ext4_calc_metadata_amount(inode, lblock)); > trace_ext4_da_reserve_space(inode, md_needed); > - spin_unlock(&ei->i_block_reservation_lock); > > /* > - * We will charge metadata quota at writeout time; this saves > - * us from metadata over-estimation, though we may go over by > - * a small amount in the end. Here we just reserve for data. > - */ > - ret = dquot_reserve_block(inode, EXT4_C2B(sbi, 1)); > - if (ret) > - return ret; > - /* > * We do still charge estimated metadata to the sb though; > * we cannot afford to run out of free blocks. > */ > if (ext4_claim_free_clusters(sbi, md_needed + 1, 0)) { > - dquot_release_reservation_block(inode, EXT4_C2B(sbi, 1)); > + ei->i_da_metadata_calc_len = save_len; > + ei->i_da_metadata_calc_last_lblock = save_last_lblock; > + spin_unlock(&ei->i_block_reservation_lock); > if (ext4_should_retry_alloc(inode->i_sb, &retries)) { > yield(); > goto repeat; > } > + dquot_release_reservation_block(inode, EXT4_C2B(sbi, 1)); > return -ENOSPC; > } > - spin_lock(&ei->i_block_reservation_lock); > ei->i_reserved_data_blocks++; > ei->i_reserved_meta_blocks += md_needed; > spin_unlock(&ei->i_block_reservation_lock); > -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html