On Thu, Sep 26, 2024 at 12:48:19PM +0200, David Hildenbrand wrote:
> On 25.09.24 18:59, Peter Xu wrote:
> > On Tue, Sep 24, 2024 at 04:45:00PM +0200, David Hildenbrand wrote:
> > > On 23.09.24 14:18, syzbot wrote:
> > > > Hello,
> > > >
> > > > syzbot found the following issue on:
> > > >
> > > > HEAD commit:    88264981f208 Merge tag 'sched_ext-for-6.12' of git://git.k..
> > > > git tree:       upstream
> > > > console+strace: https://syzkaller.appspot.com/x/log.txt?x=16c36c27980000
> > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=e851828834875d6f
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=bf2c35fa302ebe3c7471
> > > > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12773080580000
> > > > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16ed5e9f980000
> > > >
> > > > Downloadable assets:
> > > > disk image: https://storage.googleapis.com/syzbot-assets/0e011ac37c93/disk-88264981.raw.xz
> > > > vmlinux: https://storage.googleapis.com/syzbot-assets/f5c65577e19e/vmlinux-88264981.xz
> > > > kernel image: https://storage.googleapis.com/syzbot-assets/984d963c8ea1/bzImage-88264981.xz
> > > >
> > > > The issue was bisected to:
> > > >
> > > > commit 75182022a0439788415b2dd1db3086e07aa506f7
> > > > Author: Peter Xu <peterx@xxxxxxxxxx>
> > > > Date:   Mon Aug 26 20:43:51 2024 +0000
> > > >
> > > >     mm/x86: support large pfn mappings
> > > >
> > > > bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=17df9c27980000
> > > > final oops:     https://syzkaller.appspot.com/x/report.txt?x=143f9c27980000
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=103f9c27980000
> > > >
> > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > Reported-by: syzbot+bf2c35fa302ebe3c7471@xxxxxxxxxxxxxxxxxxxxxxxxx
> > > > Fixes: 75182022a043 ("mm/x86: support large pfn mappings")
> > > >
> > > > ------------[ cut here ]------------
> > > > WARNING: CPU: 1 PID: 5508 at mm/huge_memory.c:1602 copy_huge_pmd+0x102c/0x1c60 mm/huge_memory.c:1602
> > >
> > > This is the
> > >
> > > VM_WARN_ON_ONCE(is_cow_mapping(src_vma->vm_flags) && pmd_write(pmd))
> > >
> > > So we have a special-marked PMD in a COW mapping.
> > >
> > > The reproducer seems to involve fuse, but not sure if that makes a
> > > difference here.
> >
> > That chunk of code seems to be there only making sure the test won't get
> > blocked due to any fuse-based fs being stuck, via writing to the "abort"
> > file:
> >
> >       snprintf(abort, sizeof(abort), "/sys/fs/fuse/connections/%s/abort",
> >                ent->d_name);
> >       int fd = open(abort, O_WRONLY);
> >       if (fd == -1) {
> >         continue;
> >       }
> >       if (write(fd, abort, 1) < 0) {
> >       }
> >       close(fd);
> >
> > So far looks not relevant to this issue indeed.
> >
> > Unfortunately I cannot reproduce it even with the reproducer.  So this one
> > is a bit tricky..
> >
> > What confuses me yet is how that special bit is set, if it's only used so
> > far with vfio-pci, and this test doesn't seem to have it involved.
> >
> > The test keeps invoking processes, then threads, doing concurrent accesses
> > over a few things (madvise, mremap, migrate_pages, munmap, etc.) on the
> > pre-mapped areas, but none of them seem to create new memory that can
> > provide a hint on how the special bit can start to occur.
> >
> > I wonder if some of these operations can race in a way that mm can wrongly
> > create the special bit (along with it being writable).. and then it could
> > be a historical bug, only captured by this patchset due to the newly added
> > WARN_ON_ONCE somehow, then it could mean that it's not the WRITE bit that
> > is not intended, but the SPECIAL bit altogether.
>
> I assume you are missing a check for present/non-swap pmds. Assume you have
> a migration entry and end up using the special bit -- which is perfectly
> fine -- your code would assume it's a present PMD with the special bit set.
>
> Maybe for the time being something like:
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 0580ac9e47b9..e55efcad1e6c 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1586,7 +1586,7 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
>         int ret = -ENOMEM;
>
>         pmd = pmdp_get_lockless(src_pmd);
> -       if (unlikely(pmd_special(pmd))) {
> +       if (unlikely(pmd_present(pmd) && pmd_special(pmd))) {
>                 dst_ptl = pmd_lock(dst_mm, dst_pmd);
>                 src_ptl = pmd_lockptr(src_mm, src_pmd);
>                 spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);

Good catch!  I definitely overlooked it, and I did check that the config has
THP_MIGRATION set indeed.  So it's very possibly relevant.

Do you want to send a formal patch?  You can also push a branch with "#syz
test"; it looks like syzbot can reproduce it consistently.

Thanks!

-- 
Peter Xu
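
For readers following the thread, the aliasing David describes can be
sketched in user space.  This is only a toy model: every toy_* name and
every bit position below is invented for illustration and does not match
the real x86 PMD layout or any kernel helper.  It only shows why a raw
"special" bit test can misfire on a non-present (e.g. migration) entry,
and why guarding it with a present check avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: the layout is hypothetical, not the real x86 encoding. */
typedef uint64_t toy_pmd_t;

#define TOY_PRESENT   (UINT64_C(1) << 0) /* entry maps memory */
#define TOY_SPECIAL   (UINT64_C(1) << 9) /* meaningful only when present */
#define TOY_SWP_SHIFT 1                  /* non-present: payload starts at bit 1 */

static inline bool toy_pmd_present(toy_pmd_t pmd)
{
	return (pmd & TOY_PRESENT) != 0;
}

/* Raw bit test; it does not know whether the entry is present. */
static inline bool toy_pmd_special(toy_pmd_t pmd)
{
	return (pmd & TOY_SPECIAL) != 0;
}

/* Non-present entry: PRESENT is clear and the payload reuses the other bits. */
static inline toy_pmd_t toy_make_migration_entry(uint64_t payload)
{
	return (payload << TOY_SWP_SHIFT) & ~TOY_PRESENT;
}

/* Mirrors the shape of the proposed check: present first, then special. */
static inline bool toy_special_guarded(toy_pmd_t pmd)
{
	return toy_pmd_present(pmd) && toy_pmd_special(pmd);
}

/* Small self-check that exercises both kinds of entries. */
static inline void toy_selfcheck(void)
{
	/* Payload 0x100 shifted by 1 lands on bit 9, the toy "special" bit. */
	toy_pmd_t mig = toy_make_migration_entry(0x100);
	toy_pmd_t real = TOY_PRESENT | TOY_SPECIAL;

	assert(!toy_pmd_present(mig));
	assert(toy_pmd_special(mig));      /* unguarded test misfires */
	assert(!toy_special_guarded(mig)); /* guarded test does not */
	assert(toy_special_guarded(real)); /* real special entry still matches */
}
```

Calling toy_selfcheck() from a main() passes silently.  The point is only
the ordering of the two tests; whether the real migration entry encoding
overlaps the special bit on a given architecture is something the kernel
headers decide, not this sketch.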