Re: [syzbot] [ext4?] WARNING in mb_cache_destroy

Jan Kara <jack@xxxxxxx> · Fri, 3 May 2024 16:09:16 +0200

On Fri 03-05-24 19:38:21, Baokun Li wrote:
> On 2024/5/3 18:23, Jan Kara wrote:
> > Hi!
> > 
> > On Fri 03-05-24 17:51:07, Baokun Li wrote:
> > > On 2024/5/2 18:33, Jan Kara wrote:
> > > > On Tue 30-04-24 08:04:03, syzbot wrote:
> > > > > syzbot has bisected this issue to:
> > > > > 
> > > > > commit 67d7d8ad99beccd9fe92d585b87f1760dc9018e3
> > > > > Author: Baokun Li <libaokun1@xxxxxxxxxx>
> > > > > Date:   Thu Jun 16 02:13:56 2022 +0000
> > > > > 
> > > > >       ext4: fix use-after-free in ext4_xattr_set_entry
> > > > So I'm not sure the bisect is correct since the change is looking harmless.
> > > Yes, the root cause of the problem has nothing to do with this patch,
> > > and please see the detailed analysis below.
> > > > But it is sufficiently related that there indeed may be some relationship.
> > > > Anyway, the kernel log has:
> > > > 
> > > > [   44.932900][ T1063] EXT4-fs warning (device loop0): ext4_evict_inode:297: xattr delete (err -12)
> > > > [   44.943316][ T1063] EXT4-fs (loop0): unmounting filesystem.
> > > > [   44.949531][ T1063] ------------[ cut here ]------------
> > > > [   44.955050][ T1063] WARNING: CPU: 0 PID: 1063 at fs/mbcache.c:409 mb_cache_destroy+0xda/0x110
> > > > 
> > > > So ext4_xattr_delete_inode() called when removing inode has failed with
> > > > ENOMEM and later mb_cache_destroy() was eventually complaining about having
> > > > mbcache entry with increased refcount. So likely some error cleanup path is
> > > > forgetting to drop mbcache entry reference somewhere but at this point I
> > > > cannot find where. We'll likely need to play with the reproducer to debug
> > > > that. Baokun, any chance for looking into this?
> > > > 
> > > > 								Honza
> > > As you guessed, when -ENOMEM is returned in ext4_sb_bread(),
> > > the reference count of ce is not properly released, as follows.
> > > 
> > > ext4_create
> > >   __ext4_new_inode
> > >    security_inode_init_security
> > >     ext4_initxattrs
> > >      ext4_xattr_set_handle
> > >       ext4_xattr_block_find
> > >       ext4_xattr_block_set
> > >        ext4_xattr_block_cache_find
> > >          ce = mb_cache_entry_find_first
> > >              __entry_find
> > >              atomic_inc_not_zero(&entry->e_refcnt)
> > >          bh = ext4_sb_bread(inode->i_sb, ce->e_value, REQ_PRIO);
> > >          if (PTR_ERR(bh) == -ENOMEM)
> > >              return NULL;
> > > 
> > > Before merging into commit 67d7d8ad99be("ext4: fix use-after-free
> > > in ext4_xattr_set_entry"), it will not return early in
> > > ext4_xattr_ibody_find(),
> > > so it tries to find it in iboy, fails the check in xattr_check_inode() and
> > > returns without executing ext4_xattr_block_find(). Thus it will bisect
> > > the patch, but actually has nothing to do with it.
> > > 
> > > ext4_xattr_ibody_get
> > >   xattr_check_inode
> > >    __xattr_check_inode
> > >     check_xattrs
> > >      if (end - (void *)header < sizeof(*header) + sizeof(u32))
> > >        "in-inode xattr block too small"
> > > 
> > > Here's the patch in testing, I'll send it out officially after it is tested.
> > > (PS:  I'm not sure if propagating the ext4_xattr_block_cache_find() errors
> > > would be better.)
> > Great! Thanks for debugging this! Some comments to your fix below:
> > 
> > > diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
> > > index b67a176bfcf9..5c9e751915fd 100644
> > > --- a/fs/ext4/xattr.c
> > > +++ b/fs/ext4/xattr.c
> > > @@ -3113,11 +3113,10 @@ ext4_xattr_block_cache_find(struct inode *inode,
> > > 
> > >           bh = ext4_sb_bread(inode->i_sb, ce->e_value, REQ_PRIO);
> > >           if (IS_ERR(bh)) {
> > > -            if (PTR_ERR(bh) == -ENOMEM)
> > > -                return NULL;
> > > +            if (PTR_ERR(bh) != -ENOMEM)
> > > +                EXT4_ERROR_INODE(inode, "block %lu read error",
> > > +                         (unsigned long)ce->e_value);
> > >               bh = NULL;
> > > -            EXT4_ERROR_INODE(inode, "block %lu read error",
> > > -                     (unsigned long)ce->e_value);
> > >           } else if (ext4_xattr_cmp(header, BHDR(bh)) == 0) {
> > >               *pce = ce;
> > >               return bh;
> > So if we get the ENOMEM error, continuing the iteration seems to be
> > pointless as we'll likely get it for the following entries as well. I think
> > the original behavior of aborting the iteration in case of ENOMEM is
> > actually better. We just have to do mb_cache_entry_put(ea_block_cache, ce)
> > before returning...
> > 
> > 								Honza
> Returning NULL here would normally attempt to allocate a new
> xattr_block in ext4_xattr_block_set(), and when ext4_sb_bread() fails,
> allocating the new block and inserting it would most likely fail as well,
> so my initial thought was to propagate the error from ext4_sb_bread()
> to also make ext4_xattr_block_set() fail when ext4_sb_bread() fails.

Yes, this would be probably even better solution.

> But I noticed that before Ted added the special handling for -ENOMEM,
> EXT4_ERROR_INODE was called to set the ERROR_FS flag no matter
> what error ext4_sb_bread() returned, and after we can distinguish
> between -EIO and -ENOMEM, we don't have to set the ERROR_FS flag
> in the case of -ENOMEM. So there's this conservative fix now.
> 
> In short, in my personal opinion, for -EIO and -ENOMEM, they should
> be the same except whether or not the ERROR_FS flag is set.
> Otherwise, I think adding mb_cache_entry_put() directly is the easiest
> and most straightforward fix.  Honza, do you have any other thoughts?

Yeah. I'd go for adding mb_cache_entry_put() now as a quick fix and then
work on propagating the error from ext4_xattr_block_cache_find() as a
cleaner solution...

								Honza

-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR