On Tue 13-08-24 20:34:46, Zhang Yi wrote: > From: Zhang Yi <yi.zhang@xxxxxxxxxx> > > Now that we update data reserved space for delalloc after allocating > new blocks in ext4_{ind|ext}_map_blocks(), and if bigalloc feature is > enabled, we also need to query the extents_status tree to calculate the > exact reserved clusters. This is complicated now and it appears that > it's better to do this job in ext4_es_insert_extent(), because > __es_remove_extent() have already count delalloc blocks when removing > delalloc extents and __revise_pending() return new adding pending count, > we could update the reserved blocks easily in ext4_es_insert_extent(). > > We direct reduce the reserved cluster count when replacing a delalloc > extent. However, thers are two special cases need to concern about the > quota claiming when doing direct block allocation (e.g. from fallocate). > > A), > fallocate a range that covers a delalloc extent but start with > non-delayed allocated blocks, e.g. a hole. > > hhhhhhh+ddddddd+ddddddd > ^^^^^^^^^^^^^^^^^^^^^^^ fallocate this range > > Current ext4_map_blocks() can't always trim the extent since it may > release i_data_sem before calling ext4_map_create_blocks() and raced by > another delayed allocation. Hence the EXT4_GET_BLOCKS_DELALLOC_RESERVE > may not set even when we are replacing a delalloc extent, without this > flag set, the quota has already been claimed by ext4_mb_new_blocks(), so > we should release the quota reservations instead of claim them again. > > B), > bigalloc feature is enabled, fallocate a range that contains non-delayed > allocated blocks. > > |< one cluster >| > hhhhhhh+hhhhhhh+hhhhhhh+ddddddd > ^^^^^^^ fallocate this range > > This case is similar to above case, the EXT4_GET_BLOCKS_DELALLOC_RESERVE > flag is also not set. > > Hence we should release the quota reservations if we replace a delalloc > extent but without EXT4_GET_BLOCKS_DELALLOC_RESERVE set. > > Signed-off-by: Zhang Yi <yi.zhang@xxxxxxxxxx> I really like how simple this ended being! Feel free to add: Reviewed-by: Jan Kara <jack@xxxxxxx> Honza > --- > fs/ext4/extents.c | 37 ------------------------------------- > fs/ext4/extents_status.c | 25 ++++++++++++++++++++++++- > fs/ext4/indirect.c | 7 ------- > 3 files changed, 24 insertions(+), 45 deletions(-) > > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c > index 671dacd7c873..db8f9d79477c 100644 > --- a/fs/ext4/extents.c > +++ b/fs/ext4/extents.c > @@ -4357,43 +4357,6 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode, > goto out; > } > > - /* > - * Reduce the reserved cluster count to reflect successful deferred > - * allocation of delayed allocated clusters or direct allocation of > - * clusters discovered to be delayed allocated. Once allocated, a > - * cluster is not included in the reserved count. > - */ > - if (test_opt(inode->i_sb, DELALLOC) && allocated_clusters) { > - if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE) { > - /* > - * When allocating delayed allocated clusters, simply > - * reduce the reserved cluster count and claim quota > - */ > - ext4_da_update_reserve_space(inode, allocated_clusters, > - 1); > - } else { > - ext4_lblk_t lblk, len; > - unsigned int n; > - > - /* > - * When allocating non-delayed allocated clusters > - * (from fallocate, filemap, DIO, or clusters > - * allocated when delalloc has been disabled by > - * ext4_nonda_switch), reduce the reserved cluster > - * count by the number of allocated clusters that > - * have previously been delayed allocated. Quota > - * has been claimed by ext4_mb_new_blocks() above, > - * so release the quota reservations made for any > - * previously delayed allocated clusters. > - */ > - lblk = EXT4_LBLK_CMASK(sbi, map->m_lblk); > - len = allocated_clusters << sbi->s_cluster_bits; > - n = ext4_es_delayed_clu(inode, lblk, len); > - if (n > 0) > - ext4_da_update_reserve_space(inode, (int) n, 0); > - } > - } > - > /* > * Cache the extent and update transaction to commit on fdatasync only > * when it is _not_ an unwritten extent. > diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c > index 0580bc4bc762..41adf0d69959 100644 > --- a/fs/ext4/extents_status.c > +++ b/fs/ext4/extents_status.c > @@ -853,6 +853,7 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk, > struct extent_status newes; > ext4_lblk_t end = lblk + len - 1; > int err1 = 0, err2 = 0, err3 = 0; > + int resv_used = 0, pending = 0; > struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); > struct extent_status *es1 = NULL; > struct extent_status *es2 = NULL; > @@ -891,7 +892,7 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk, > pr = __alloc_pending(true); > write_lock(&EXT4_I(inode)->i_es_lock); > > - err1 = __es_remove_extent(inode, lblk, end, NULL, es1); > + err1 = __es_remove_extent(inode, lblk, end, &resv_used, es1); > if (err1 != 0) > goto error; > /* Free preallocated extent if it didn't get used. */ > @@ -921,9 +922,31 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk, > __free_pending(pr); > pr = NULL; > } > + pending = err3; > } > error: > write_unlock(&EXT4_I(inode)->i_es_lock); > + /* > + * Reduce the reserved cluster count to reflect successful deferred > + * allocation of delayed allocated clusters or direct allocation of > + * clusters discovered to be delayed allocated. Once allocated, a > + * cluster is not included in the reserved count. > + * > + * When direct allocating (from fallocate, filemap, DIO, or clusters > + * allocated when delalloc has been disabled by ext4_nonda_switch()) > + * an extent either 1) contains delayed blocks but start with > + * non-delayed allocated blocks (e.g. hole) or 2) contains non-delayed > + * allocated blocks which belong to delayed allocated clusters when > + * bigalloc feature is enabled, quota has already been claimed by > + * ext4_mb_new_blocks(), so release the quota reservations made for > + * any previously delayed allocated clusters instead of claim them > + * again. > + */ > + resv_used += pending; > + if (resv_used) > + ext4_da_update_reserve_space(inode, resv_used, > + flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE); > + > if (err1 || err2 || err3 < 0) > goto retry; > > diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c > index d8ca7f64f952..7404f0935c90 100644 > --- a/fs/ext4/indirect.c > +++ b/fs/ext4/indirect.c > @@ -652,13 +652,6 @@ int ext4_ind_map_blocks(handle_t *handle, struct inode *inode, > ext4_update_inode_fsync_trans(handle, inode, 1); > count = ar.len; > > - /* > - * Update reserved blocks/metadata blocks after successful block > - * allocation which had been deferred till now. > - */ > - if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE) > - ext4_da_update_reserve_space(inode, count, 1); > - > got_it: > map->m_flags |= EXT4_MAP_MAPPED; > map->m_pblk = le32_to_cpu(chain[depth-1].key); > -- > 2.39.2 > -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR