On Mon, 23 Oct 2023, Liam R. Howlett wrote:
> * syzbot <syzbot+79fcba037b6df73756d3@xxxxxxxxxxxxxxxxxxxxxxxxx> [231023 13:24]:
> > syzbot has found a reproducer for the following issue on:
> >
> > HEAD commit:    e8361b005d7c Add linux-next specific files for 20231023
> > git tree:       linux-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1207cb05680000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=75e8fc3570ec9a74
> > dashboard link: https://syzkaller.appspot.com/bug?extid=79fcba037b6df73756d3
> > compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=107fab89680000
> >
> > Downloadable assets:
> > disk image: https://storage.googleapis.com/syzbot-assets/e28a7944599e/disk-e8361b00.raw.xz
> > vmlinux: https://storage.googleapis.com/syzbot-assets/7dd355dbe055/vmlinux-e8361b00.xz
> > kernel image: https://storage.googleapis.com/syzbot-assets/7b2a9050635d/bzImage-e8361b00.xz
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+79fcba037b6df73756d3@xxxxxxxxxxxxxxxxxxxxxxxxx
> >
> > =============================
> > WARNING: suspicious RCU usage
> > 6.6.0-rc6-next-20231023-syzkaller #0 Not tainted
> > -----------------------------
> > lib/maple_tree.c:856 suspicious rcu_dereference_check() usage!
> >
> > other info that might help us debug this:
> >
> >
> > rcu_scheduler_active = 2, debug_locks = 1
> > no locks held by syz-executor.4/5222.
> >
> > stack backtrace:
> > CPU: 0 PID: 5222 Comm: syz-executor.4 Not tainted 6.6.0-rc6-next-20231023-syzkaller #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/06/2023
> > Call Trace:
> >  <TASK>
> >  __dump_stack lib/dump_stack.c:88 [inline]
> >  dump_stack_lvl+0x125/0x1b0 lib/dump_stack.c:106
> >  lockdep_rcu_suspicious+0x20b/0x3a0 kernel/locking/lockdep.c:6711
> >  mas_root lib/maple_tree.c:856 [inline]
> >  mas_root lib/maple_tree.c:854 [inline]
> >  mas_start lib/maple_tree.c:1385 [inline]
> >  mas_state_walk lib/maple_tree.c:3705 [inline]
> >  mas_walk+0x4d1/0x7d0 lib/maple_tree.c:4888
> >  mas_find_setup lib/maple_tree.c:5948 [inline]
> >  mas_find+0x1e6/0x400 lib/maple_tree.c:5989
> >  vma_find include/linux/mm.h:952 [inline]
> >  do_mbind+0xc8f/0x1010 mm/mempolicy.c:1328
>
> Hugh,
>
> 41de65c4cd27 ("mempolicy: mmap_lock is not needed while migrating
> folios") changes the do_mbind() code locking here to drop the mmap write
> lock on line 1300 in e8361b005d7c.

Thanks Liam: yes, this is a good helpful find by syzbot.

The "mmap_lock is not needed while migrating folios" patch was and is
good, but the "attempt to match interleave nodes" patch on top of that
then broke it, by adding a vma search after the mmap_lock drop point.

>
> This is an issue as it opens the VMA (maple) tree to being updated, but
> you then re-walk the tree later.  If this is okay, then you can add an
> rcu_read_lock()/rcu_read_unlock() to iterate over the VMAs so it is
> safe (around 1327/1332, respectively).

Oh, that's a nice suggestion, thanks.  My first inclination was to move
the mmap_write_unlock() down, but perhaps the RCU way would be neater.
Perhaps, but perhaps not: I'll think more and send a fix patch later
in the day.

>
> I'm not entirely sure why this is safe to do without the mmap write
> lock, but considering the change log it seems you have thought through
> it.
> I'm just not sure what is going to stop the VMAs from being split
> or such by a ref count on the memory policy (or if it matters if they
> are)?

Nothing stops those VMAs from being split or unmapped or remapped or
re-mbinded or whatever while doing the migrate_pages(&pagelist).  But
those changes to the VMAs do not affect the work defined for
migrate_pages(&pagelist) at all (they may make that work redundant,
but such cases would be rare in real-life workloads).

Previously, the VMAs were required to choose the migrate-to nodes;
but now that choice depends only on the refcounted mpols.

Hugh