On Fri, Jul 26, 2024 at 11:02:35AM GMT, Paul E. McKenney wrote: > On Fri, Jul 26, 2024 at 03:54:28PM +0200, Christian Brauner wrote: > > Hey, > > > > I could use some help with understanding a bug related to rcu that was > > reported today. It first seems to have shown up on the 25th of July: > > > > https://syzkaller.appspot.com/bug?extid=20d7e439f76bbbd863a7 > > > > We seem to be hitting the WARN_ON_ONCE() in: > > > > void rcu_sync_dtor(struct rcu_sync *rsp) > > { > > int gp_state; > > > > WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_PASSED); > > > > from destroy_super_work() which gets called when a superblock is really freed. > > > > If the superblock has been visible in userspace we do it via call_rcu(): > > > > static void destroy_super_work(struct work_struct *work) > > { > > struct super_block *s = container_of(work, struct super_block, > > destroy_work); > > fsnotify_sb_free(s); > > security_sb_free(s); > > put_user_ns(s->s_user_ns); > > kfree(s->s_subtype); > > for (int i = 0; i < SB_FREEZE_LEVELS; i++) > > percpu_free_rwsem(&s->s_writers.rw_sem[i]); > > kfree(s); > > } > > > > static void destroy_super_rcu(struct rcu_head *head) > > { > > struct super_block *s = container_of(head, struct super_block, rcu); > > INIT_WORK(&s->destroy_work, destroy_super_work); > > schedule_work(&s->destroy_work); > > } > > > > And I'm really confused because I don't understand the details for sync > > rcu enough to come up with a clear problem statement even. Could someone > > please explain what the WARN_ON_ONCE() is about? > > If I am not too confused (and Oleg will correct me if I am), this is > checking a use-after-free error. A given rcu_sync structure normally > transitions from GP_IDLE->GP_ENTER->GP_PASSED->GP_EXIT->GP_IDLE, with > possible side-trips from the GP_EXIT state through GP_REPLAY and back > to GP_EXIT in special cases such as during early boot. > > One way that this can happen is calling percpu_free_rwsem() without > having previously done the percpu_up_write() needed to match an earlier > percpu_down_write(). I don't see any other way that this can happen > right off-hand, but someone might have added something, or I might be > suffering a failure of imagination. And you were spot on of course! > > > (I had been wondering if this is related to commit 7180f8d91fcb ("vfs: > > add rcu-based find_inode variants for iget ops") and the fix in commit > > 5bc9ad78c2f8 ("vfs: handle __wait_on_freeing_inode() and evict() race") > > but I don't see how.) > > I don't see how either, unless it introduced some timing change > that made the scenario above more probable. > > > Thanks for taking a look! > > Hope this helps! If not, you know where to find me. ;-) Thank you for taking the time, Paul. That was indeed helpful!