On Thu 06-04-23 21:28:34, Baokun Li wrote: > If extent status tree update fails, we have inconsistency between what is > stored in the extent status tree and what is stored on disk. And that can > cause even data corruption issues in some cases. > > In the extent status tree, we have extents which we can just drop without > issues and extents we must not drop - this depends on the extent's status > - currently ext4_es_is_delayed() extents must stay, others may be dropped. > > For extents that cannot be dropped we use __GFP_NOFAIL to allocate memory. > A helper function is also added to help determine if the current extent can > be dropped, although only ext4_es_is_delayed() extents cannot be dropped > currently. In addition, with the above logic, the undo operation in > __es_remove_extent that may cause inconsistency if the split extent fails > is unnecessary, so we remove it as well. > > Suggested-by: Jan Kara <jack@xxxxxxx> > Signed-off-by: Baokun Li <libaokun1@xxxxxxxxxx> > --- > V1->V2: > Add the patch 2 as suggested by Jan Kara. > > fs/ext4/extents_status.c | 36 +++++++++++++++++++++++++++++------- > 1 file changed, 29 insertions(+), 7 deletions(-) > > diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c > index 7bc221038c6c..8eed17f35b11 100644 > --- a/fs/ext4/extents_status.c > +++ b/fs/ext4/extents_status.c > @@ -448,12 +448,29 @@ static void ext4_es_list_del(struct inode *inode) > spin_unlock(&sbi->s_es_lock); > } > > +/* > + * Helper function to help determine if memory allocation for this > + * extent_status is allowed to fail. > + */ > +static inline bool ext4_es_alloc_should_nofail(struct extent_status *es) I'd call this function ext4_es_must_keep() and also use it in es_do_reclaim_extents() instead of ext4_es_is_delayed(). Do this as a preparatory patch please. > @@ -792,9 +809,16 @@ static int __es_insert_extent(struct inode *inode, struct extent_status *newes) > } > > es = ext4_es_alloc_extent(inode, newes->es_lblk, newes->es_len, > - newes->es_pblk); > - if (!es) > - return -ENOMEM; > + newes->es_pblk, 0); I would just call this like: es = ext4_es_alloc_extent(inode, newes->es_lblk, newes->es_len, newes->es_pblk, ext4_es_must_keep(newes)); to save the ifs below. > + if (!es) { > + /* Use GFP_NOFAIL if the allocation cannot fail. */ > + if (ext4_es_alloc_should_nofail(newes)) > + es = ext4_es_alloc_extent(inode, newes->es_lblk, > + newes->es_len, newes->es_pblk, 1); > + else > + return -ENOMEM; > + } > + > rb_link_node(&es->rb_node, parent, p); > rb_insert_color(&es->rb_node, &tree->root); > > @@ -1349,8 +1373,6 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk, > ext4_es_status(&orig_es)); > err = __es_insert_extent(inode, &newes); > if (err) { > - es->es_lblk = orig_es.es_lblk; > - es->es_len = orig_es.es_len; > if ((err == -ENOMEM) && > __es_shrink(EXT4_SB(inode->i_sb), > 128, EXT4_I(inode))) Also now __es_remove_extent() cannot fail (it will always remove what it should, maybe more) so please just make it void function (as a separate cleanup patch afterwards). Thanks! Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR