On Fri 03-05-24 19:38:21, Baokun Li wrote: > On 2024/5/3 18:23, Jan Kara wrote: > > Hi! > > > > On Fri 03-05-24 17:51:07, Baokun Li wrote: > > > On 2024/5/2 18:33, Jan Kara wrote: > > > > On Tue 30-04-24 08:04:03, syzbot wrote: > > > > > syzbot has bisected this issue to: > > > > > > > > > > commit 67d7d8ad99beccd9fe92d585b87f1760dc9018e3 > > > > > Author: Baokun Li <libaokun1@xxxxxxxxxx> > > > > > Date: Thu Jun 16 02:13:56 2022 +0000 > > > > > > > > > > ext4: fix use-after-free in ext4_xattr_set_entry > > > > So I'm not sure the bisect is correct since the change is looking harmless. > > > Yes, the root cause of the problem has nothing to do with this patch, > > > and please see the detailed analysis below. > > > > But it is sufficiently related that there indeed may be some relationship. > > > > Anyway, the kernel log has: > > > > > > > > [ 44.932900][ T1063] EXT4-fs warning (device loop0): ext4_evict_inode:297: xattr delete (err -12) > > > > [ 44.943316][ T1063] EXT4-fs (loop0): unmounting filesystem. > > > > [ 44.949531][ T1063] ------------[ cut here ]------------ > > > > [ 44.955050][ T1063] WARNING: CPU: 0 PID: 1063 at fs/mbcache.c:409 mb_cache_destroy+0xda/0x110 > > > > > > > > So ext4_xattr_delete_inode() called when removing inode has failed with > > > > ENOMEM and later mb_cache_destroy() was eventually complaining about having > > > > mbcache entry with increased refcount. So likely some error cleanup path is > > > > forgetting to drop mbcache entry reference somewhere but at this point I > > > > cannot find where. We'll likely need to play with the reproducer to debug > > > > that. Baokun, any chance for looking into this? > > > > > > > > Honza > > > As you guessed, when -ENOMEM is returned in ext4_sb_bread(), > > > the reference count of ce is not properly released, as follows. > > > > > > ext4_create > > > __ext4_new_inode > > > security_inode_init_security > > > ext4_initxattrs > > > ext4_xattr_set_handle > > > ext4_xattr_block_find > > > ext4_xattr_block_set > > > ext4_xattr_block_cache_find > > > ce = mb_cache_entry_find_first > > > __entry_find > > > atomic_inc_not_zero(&entry->e_refcnt) > > > bh = ext4_sb_bread(inode->i_sb, ce->e_value, REQ_PRIO); > > > if (PTR_ERR(bh) == -ENOMEM) > > > return NULL; > > > > > > Before merging into commit 67d7d8ad99be("ext4: fix use-after-free > > > in ext4_xattr_set_entry"), it will not return early in > > > ext4_xattr_ibody_find(), > > > so it tries to find it in iboy, fails the check in xattr_check_inode() and > > > returns without executing ext4_xattr_block_find(). Thus it will bisect > > > the patch, but actually has nothing to do with it. > > > > > > ext4_xattr_ibody_get > > > xattr_check_inode > > > __xattr_check_inode > > > check_xattrs > > > if (end - (void *)header < sizeof(*header) + sizeof(u32)) > > > "in-inode xattr block too small" > > > > > > Here's the patch in testing, I'll send it out officially after it is tested. > > > (PS: I'm not sure if propagating the ext4_xattr_block_cache_find() errors > > > would be better.) > > Great! Thanks for debugging this! Some comments to your fix below: > > > > > diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c > > > index b67a176bfcf9..5c9e751915fd 100644 > > > --- a/fs/ext4/xattr.c > > > +++ b/fs/ext4/xattr.c > > > @@ -3113,11 +3113,10 @@ ext4_xattr_block_cache_find(struct inode *inode, > > > > > > bh = ext4_sb_bread(inode->i_sb, ce->e_value, REQ_PRIO); > > > if (IS_ERR(bh)) { > > > - if (PTR_ERR(bh) == -ENOMEM) > > > - return NULL; > > > + if (PTR_ERR(bh) != -ENOMEM) > > > + EXT4_ERROR_INODE(inode, "block %lu read error", > > > + (unsigned long)ce->e_value); > > > bh = NULL; > > > - EXT4_ERROR_INODE(inode, "block %lu read error", > > > - (unsigned long)ce->e_value); > > > } else if (ext4_xattr_cmp(header, BHDR(bh)) == 0) { > > > *pce = ce; > > > return bh; > > So if we get the ENOMEM error, continuing the iteration seems to be > > pointless as we'll likely get it for the following entries as well. I think > > the original behavior of aborting the iteration in case of ENOMEM is > > actually better. We just have to do mb_cache_entry_put(ea_block_cache, ce) > > before returning... > > > > Honza > Returning NULL here would normally attempt to allocate a new > xattr_block in ext4_xattr_block_set(), and when ext4_sb_bread() fails, > allocating the new block and inserting it would most likely fail as well, > so my initial thought was to propagate the error from ext4_sb_bread() > to also make ext4_xattr_block_set() fail when ext4_sb_bread() fails. Yes, this would be probably even better solution. > But I noticed that before Ted added the special handling for -ENOMEM, > EXT4_ERROR_INODE was called to set the ERROR_FS flag no matter > what error ext4_sb_bread() returned, and after we can distinguish > between -EIO and -ENOMEM, we don't have to set the ERROR_FS flag > in the case of -ENOMEM. So there's this conservative fix now. > > In short, in my personal opinion, for -EIO and -ENOMEM, they should > be the same except whether or not the ERROR_FS flag is set. > Otherwise, I think adding mb_cache_entry_put() directly is the easiest > and most straightforward fix. Honza, do you have any other thoughts? Yeah. I'd go for adding mb_cache_entry_put() now as a quick fix and then work on propagating the error from ext4_xattr_block_cache_find() as a cleaner solution... Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR