Hi Andrew - TL;DR: please apply the fix-patch attached below to fix a problem
in this series, thanks! :)

On Tue, Sep 03, 2024 at 12:00:04PM GMT, Lorenzo Stoakes wrote:
> On Tue, Sep 03, 2024 at 11:07:38AM GMT, Pengfei Xu wrote:
> > Hi Liam R. Howlett,
> >
> > Greetings!
> >
> > There is a WARNING in __split_vma in next-20240902 in a syzkaller fuzzing test.
> > Bisected and found the first bad commit:
> > "
> > 3483c95414f9 mm: change failure of MAP_FIXED to restoring the gap on failure
> > "
> > It's the same as the patch below.
> > After reverting the above commit on top of next-20240902, the issue was gone.
> >
> > All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/240903_092137___split_vma
> > Syzkaller repro code: https://github.com/xupengfe/syzkaller_logs/blob/main/240903_092137___split_vma/repro.c
> > Syzkaller repro syscall steps: https://github.com/xupengfe/syzkaller_logs/blob/main/240903_092137___split_vma/repro.prog
> > Syzkaller report: https://github.com/xupengfe/syzkaller_logs/blob/main/240903_092137___split_vma/repro.report
> > Kconfig(make olddefconfig): https://github.com/xupengfe/syzkaller_logs/blob/main/240903_092137___split_vma/kconfig_origin
> > Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/240903_092137___split_vma/bisect_info.log
> > bzImage: https://github.com/xupengfe/syzkaller_logs/raw/main/240903_092137___split_vma/bzImage_ecc768a84f0b8e631986f9ade3118fa37852fef0.tar.gz
> > Issue dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/240903_092137___split_vma/ecc768a84f0b8e631986f9ade3118fa37852fef0_dmesg.log
> >
> > And "KASAN: slab-use-after-free Read in acct_collect" also pointed to the
> > same commit, all detailed info:
> > https://github.com/xupengfe/syzkaller_logs/tree/main/240903_090000_acct_collect
> >
> > "
>
> Thanks for the report! Looking into it.
>
> > [ 19.953726] cgroup: Unknown subsys name 'net'
> > [ 20.045121] cgroup: Unknown subsys name 'rlimit'
> > [ 20.138332] ------------[ cut here ]------------
> > [ 20.138634] WARNING: CPU: 1 PID: 732 at include/linux/maple_tree.h:733 __split_vma+0x4d7/0x1020
> > [ 20.139075] Modules linked in:
> > [ 20.139245] CPU: 1 UID: 0 PID: 732 Comm: repro Not tainted 6.11.0-rc6-next-20240902-ecc768a84f0b #1
> > [ 20.139779] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> > [ 20.140337] RIP: 0010:__split_vma+0x4d7/0x1020
> > [ 20.140572] Code: 89 ee 48 8b 40 10 48 89 c7 48 89 85 00 ff ff ff e8 8e 61 a7 ff 48 8b 85 00 ff ff ff 4c 39 e8 0f 83 ea fd ff ff e8 b9 5e a7 ff <0f> 0b e9 de fd ff ff 48 8b 85 30 ff ff ff 48 83 c0 10 48 89 85 18
> > [ 20.141476] RSP: 0018:ffff8880217379a0 EFLAGS: 00010293
> > [ 20.141749] RAX: 0000000000000000 RBX: ffff8880132351e0 RCX: ffffffff81bf6117
> > [ 20.142106] RDX: ffff888012c30000 RSI: ffffffff81bf6187 RDI: 0000000000000006
> > [ 20.142457] RBP: ffff888021737aa0 R08: 0000000000000001 R09: ffffed100263d3cd
> > [ 20.142814] R10: 0000000020ff9000 R11: 0000000000000001 R12: ffff888021737e40
> > [ 20.143173] R13: 0000000020ff9000 R14: 0000000020ffc000 R15: ffff888013235d20
> > [ 20.143529] FS: 00007eff937f9740(0000) GS:ffff88806c500000(0000) knlGS:0000000000000000
> > [ 20.144308] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 20.144600] CR2: 0000000020000040 CR3: 000000001f464003 CR4: 0000000000770ef0
> > [ 20.144958] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [ 20.145313] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
> > [ 20.145665] PKRU: 55555554
> > [ 20.145809] Call Trace:
> > [ 20.145940]  <TASK>
> > [ 20.146056]  ? show_regs+0x6d/0x80
> > [ 20.146247]  ? __warn+0xf3/0x380
> > [ 20.146431]  ? report_bug+0x25e/0x4b0
> > [ 20.146650]  ? __split_vma+0x4d7/0x1020
>
> Have repro'd locally. This is, unsurprisingly, on this line (even if the
> trace above doesn't decode to it, unfortunately):
>
>	vma_iter_config(vmi, new->vm_start, new->vm_end);
>
> The VMA in question spans 0x20ff9000-0x21000000, so it is 7 pages in size.
>
> At the point of invoking vma_iter_config(), the vma iterator points at
> 0x20ff9001, but we try to position it to 0x20ff9000.
>
> It seems the issue is that in do_vmi_munmap(), after vma_find() is called, we
> find a VMA at 0x20ff9000, but the VMI is positioned to 0x20ff9001...!
>
> Perhaps maple tree corruption in a previous call somehow?
>
> Interestingly, I can only repro this if I clear the qemu image each time; I'm
> guessing this is somehow tied to the instantiation of the cgroup setup or such?
>
> Am continuing the investigation.

[snip]

OK, I turned on CONFIG_DEBUG_VM_MAPLE_TREE and am hitting
VM_WARN_ON_ONCE_MM(vma->vm_start != vmi_start, mm) after gather_failed is hit
in mmap_region() as a result of call_mmap() returning an error. The call_mmap()
in question invokes kernfs_fop_mmap(), which returns -ENODEV because the
KERNFS_HAS_MMAP flag has not been set for the cgroup file being mapped.

This results in mmap_region() jumping to unmap_and_free_vma, which unmaps the
page tables in the region and goes on to abort the munmap operation. The
validate_mm() that fails is called by vms_complete_munmap_vmas(), which was
invoked by vms_abort_munmap_vmas().
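
For reference, below is a rough, untested sketch of what a userspace trigger
for this path might look like. The cgroup path, the fixed addresses and the
final munmap() are illustrative guesses pieced together from the log and the
analysis here; the actual reproducer is the syzkaller repro.c linked above:

	#include <fcntl.h>
	#include <stdio.h>
	#include <sys/mman.h>

	int main(void)
	{
		/* Victim VMA, mirroring the 7-page VMA at 0x20ff9000 in the splat. */
		void *victim = mmap((void *)0x20ff9000UL, 7 * 4096,
				    PROT_READ | PROT_WRITE | PROT_EXEC,
				    MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0);

		/* A mapping immediately below it for the failing MAP_FIXED mmap
		 * to overlay, so the munmap gather has something to do. */
		void *below = mmap((void *)0x20ff6000UL, 3 * 4096,
				   PROT_READ | PROT_WRITE,
				   MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0);

		if (victim == MAP_FAILED || below == MAP_FAILED)
			return 1;

		/* A kernfs-backed cgroup file without KERNFS_HAS_MMAP set, so
		 * call_mmap() returns -ENODEV and mmap_region() aborts the unmap. */
		int fd = open("/sys/fs/cgroup/cgroup.procs", O_RDONLY);
		if (fd < 0)
			return 1;

		if (mmap(below, 3 * 4096, PROT_READ, MAP_SHARED | MAP_FIXED,
			 fd, 0) == MAP_FAILED)
			perror("mmap");	/* expected: ENODEV */

		/* A later split of the neighbouring VMA is what then trips the
		 * __split_vma() warning seen in the splat. */
		munmap((char *)victim + 4096, 4096);
		return 0;
	}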
The tree is then corrupted:

vma ffff888013414d20 start 0000000020ff9000 end 0000000021000000 mm ffff88800d06ae40
prot 25 anon_vma ffff8880132cc660 vm_ops 0000000000000000
pgoff 20ff9 file 0000000000000000 private_data 0000000000000000
flags: 0x8100077(read|write|exec|mayread|maywrite|mayexec|account|softdirty)

tree range: ffff888013414d20 start 20ff9001 end 20ffffff

Incorrectly starting at 0x20ff9001 rather than 0x20ff9000 - a very telling
off-by... the programmer's favourite off-by-1 :)

That made me think of how mas_*() operations have an _inclusive_ end while VMA
ranges have an _exclusive_ one.

And so I tracked down the cause to vms_abort_munmap_vmas(), which was invoking
mas_set_range() using vms->end (exclusive) as if it were inclusive, which thus
resulted in 0x20ff9000 being wrongly cleared.

The solution is simply to subtract one from it, as done in the fix-patch
attached below. I confirmed this fixes the issue, having managed to set up a
reliable repro locally.

Thanks for the report! Great find.
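To make the inclusive/exclusive mix-up concrete, the shape of the change is
roughly the below (a hedged sketch based on my description above, not the
patch itself - that follows the scissors):

	/*
	 * VMA ranges are [vm_start, vm_end), i.e. the end is exclusive,
	 * whereas maple tree ranges are [index, last] with an inclusive
	 * last index.
	 */

	/* Before: clears [vms->start, vms->end] - one byte too many. */
	mas_set_range(mas, vms->start, vms->end);

	/* After: convert the exclusive VMA end to an inclusive last. */
	mas_set_range(mas, vms->start, vms->end - 1);

Here vms->end is the 0x20ff9000 that got wrongly cleared, i.e. the first byte
of the next VMA's tree entry.

----8<----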