Re: rcu_sync_dtor() warning question

Christian Brauner <brauner@xxxxxxxxxx> · Mon, 29 Jul 2024 10:28:03 +0200

On Fri, Jul 26, 2024 at 08:43:01PM GMT, Mateusz Guzik wrote:
> On Fri, Jul 26, 2024 at 8:39 PM Mateusz Guzik <mjguzik@xxxxxxxxx> wrote:
> >
> > On Fri, Jul 26, 2024 at 8:02 PM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> > >
> > > On Fri, Jul 26, 2024 at 03:54:28PM +0200, Christian Brauner wrote:
> > > > Hey,
> > > >
> > > > I could use some help with understanding a bug related to rcu that was
> > > > reported today. It first seems to have shown up on the 25th of July:
> > > >
> > > > https://syzkaller.appspot.com/bug?extid=20d7e439f76bbbd863a7
> > > >
> > > > We seem to be hitting the WARN_ON_ONCE() in:
> > > >
> > > > void rcu_sync_dtor(struct rcu_sync *rsp)
> > > > {
> > > >         int gp_state;
> > > >
> > > >         WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_PASSED);
> > > >
> > > > from destroy_super_work() which gets called when a superblock is really freed.
> > > >
> > > > If the superblock has been visible in userspace we do it via call_rcu():
> > > >
> > > > static void destroy_super_work(struct work_struct *work)
> > > > {
> > > >         struct super_block *s = container_of(work, struct super_block,
> > > >                                                         destroy_work);
> > > >         fsnotify_sb_free(s);
> > > >         security_sb_free(s);
> > > >         put_user_ns(s->s_user_ns);
> > > >         kfree(s->s_subtype);
> > > >         for (int i = 0; i < SB_FREEZE_LEVELS; i++)
> > > >                 percpu_free_rwsem(&s->s_writers.rw_sem[i]);
> > > >         kfree(s);
> > > > }
> > > >
> > > > static void destroy_super_rcu(struct rcu_head *head)
> > > > {
> > > >         struct super_block *s = container_of(head, struct super_block, rcu);
> > > >         INIT_WORK(&s->destroy_work, destroy_super_work);
> > > >         schedule_work(&s->destroy_work);
> > > > }
> > > >
> > > > And I'm really confused because I don't understand the details for sync
> > > > rcu enough to come up with a clear problem statement even. Could someone
> > > > please explain what the WARN_ON_ONCE() is about?
> > >
> > > If I am not too confused (and Oleg will correct me if I am), this is
> > > checking a use-after-free error.  A given rcu_sync structure normally
> > > transitions from GP_IDLE->GP_ENTER->GP_PASSED->GP_EXIT->GP_IDLE, with
> > > possible side-trips from the GP_EXIT state through GP_REPLAY and back
> > > to GP_EXIT in special cases such as during early boot.
> > >
> >
> > use-after-free? In that case I have a candidate for a culprit.
> >
> > Code prior to any of my changes was doing the following in iget_locked():
> >         spin_lock(&inode_hash_lock);
> >         inode = find_inode_fast(sb, head, ino);
> >         spin_unlock(&inode_hash_lock);
> >         if (inode) {
> >                 if (IS_ERR(inode))
> >                         return NULL;
> >                 wait_on_inode(inode);
> >                 if (unlikely(inode_unhashed(inode))) {
> >                         iput(inode);
> >                         goto again;
> >                 }
> >                 return inode;
> >         }
> >
> > My patch removed the spinlock acquire and made it significantly more
> > likely for the code to end up doing the wait_on_inode + inode_unhashed
> > combo when racing against inode teardown.
> >
> > Now that you bring up use-after-free I'm not particularly confident
> > the stock code is correct.
> >
> > For example evict_inodes() -> evict() can mess with the inode and
> > result in the iput() call in iget_locked(), which then will invoke
> > evict() again. And I'm not particularly confident the routine +
> > everything it calls is idempotent.
> >
> > That's from a quick poke around, maybe I missed something.
> >
> > syzkaller claims to have a reproducer. Trivial usage in my debug vm
> > does not result in anything, so I may need to grab their entire setup
> > to reproduce.
> >
> > I'm going to look into it.
> 
> Welp.
> 
> syzbot did the bisect, it's not any of the above, instead:
> 
> commit b62e71be2110d8b52bf5faf3c3ed7ca1a0c113a5
> Author: Chao Yu <chao@xxxxxxxxxx>
> Date:   Sun Apr 23 15:49:15 2023 +0000
> 
>     f2fs: support errors=remount-ro|continue|panic mountoption
> 
> https://lore.kernel.org/linux-fsdevel/0000000000004ff2dc061e281637@xxxxxxxxxx/T/#m90c03813e12e5cdff1eeada8f9ab581d5f039c76
> 
> That said, the stuff I mentioned still looks highly suspicious so I
> have to something to investigate regardless.

Thanks for looking into this!