On Wed, Jun 15, 2022 at 1:05 PM Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
>
> On Wed, Jun 15, 2022 at 12:55 PM Liam Howlett <liam.howlett@xxxxxxxxxx> wrote:
> >
> > * Yu Zhao <yuzhao@xxxxxxxxxx> [220615 14:08]:
> > > On Wed, Jun 15, 2022 at 8:25 AM Liam Howlett <liam.howlett@xxxxxxxxxx> wrote:
> > > >
> > > > * Yu Zhao <yuzhao@xxxxxxxxxx> [220611 17:50]:
> > > > > On Sat, Jun 11, 2022 at 2:11 PM Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Mon, Jun 6, 2022 at 10:40 AM Qian Cai <quic_qiancai@xxxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > On Mon, Jun 06, 2022 at 04:19:52PM +0000, Liam Howlett wrote:
> > > > > > > > Does your syscall fuzzer create a reproducer? This looks like arm64
> > > > > > > > and says 5.18.0-next-20220603 again. Was this bisected to the patch
> > > > > > > > above?
> > > > > > >
> > > > > > > This was triggered by running the fuzzer over the weekend.
> > > > > > >
> > > > > > > $ trinity -C 160
> > > > > > >
> > > > > > > No bisection was done. It was only brought up here because the trace
> > > > > > > pointed to do_mas_munmap() which was introduced here.
> > > > > >
> > > > > > Liam,
> > > > > >
> > > > > > I'm getting a similar crash on arm64 -- the allocator is madvise(),
> > > > > > not mprotect(). Please take a look.
> > > > >
> > > > > Another crash on x86_64, which seems different:
> > > >
> > > > Thanks for this. I was able to reproduce the other crashes that you and
> > > > Qian reported. I've sent out a patch set to Andrew to apply to the
> > > > branch which includes the fix for them and an unrelated issue discovered
> > > > when I wrote the testcases to cover what was going on here.
> > >
> > > Thanks. I'm restarting the test and will report the results in a few hours.
> > >
> > > > > BUG: KASAN: slab-out-of-bounds in mab_mas_cp+0x2d9/0x6c0
> > > > > Write of size 136 at addr ffff88c5a2319c80 by task stress-ng/18461
> > >                                                        ^^^^^^^^^
> > >
> > > > As for this crash, I was unable to reproduce and the code I just sent
> > > > out changes this code a lot. Was this running with "trinity -c madvise"
> > > > or another use case/fuzzer?
> > >
> > > This is also stress-ng (same as the one on arm64). The test stopped
> > > before it could try syzkaller (fuzzer).
> >
> > Thanks. What are the arguments to stress-ng you use? I've run
> > "stress-ng --class vm -a 20 -t 600s --temp-path /tmp" until it OOMs on
> > my vm, but it only has 8GB of ram.
>
> Yes, I used the same parameters with 512GB of RAM, and the kernel with
> KASAN and other debug options.

Sorry, Liam. I got the same crash :(

9d27f2f1487a (tag: mm-everything-2022-06-14-19-05, akpm/mm-everything)
00d4d7b519d6 fs/userfaultfd: Fix vma iteration in mas_for_each() loop
55140693394d maple_tree: Make mas_prealloc() error checking more generic
2d7e7c2fcf16 maple_tree: Fix mt_destroy_walk() on full non-leaf non-alloc nodes
4d4472148ccd maple_tree: Change spanning store to work on larger trees
ea36bcc14c00 test_maple_tree: Add tests for preallocations and large spanning writes
0d2aa86ead4f mm/mlock: Drop dead code in count_mm_mlocked_page_nr()

==================================================================
BUG: KASAN: slab-out-of-bounds in mab_mas_cp+0x2d9/0x6c0
Write of size 136 at addr ffff88c35a3b9e80 by task stress-ng/19303

CPU: 66 PID: 19303 Comm: stress-ng Tainted: G S I 5.19.0-smp-DEV #1
Call Trace:
 <TASK>
 dump_stack_lvl+0xc5/0xf4
 print_address_description+0x7f/0x460
 print_report+0x10b/0x240
 ? mab_mas_cp+0x2d9/0x6c0
 kasan_report+0xe6/0x110
 ? mast_spanning_rebalance+0x2634/0x29b0
 ? mab_mas_cp+0x2d9/0x6c0
 kasan_check_range+0x2ef/0x310
 ? mab_mas_cp+0x2d9/0x6c0
 ? mab_mas_cp+0x2d9/0x6c0
 memcpy+0x44/0x70
 mab_mas_cp+0x2d9/0x6c0
 mas_spanning_rebalance+0x1a3e/0x4f90
 ? stack_trace_save+0xca/0x160
 ? stack_trace_save+0xca/0x160
 mas_wr_spanning_store+0x16c5/0x1b80
 mas_wr_store_entry+0xbf9/0x12e0
 mas_store_prealloc+0x205/0x3c0
 do_mas_align_munmap+0x6cf/0xd10
 do_mas_munmap+0x1bb/0x210
 ? down_write_killable+0xa6/0x110
 __vm_munmap+0x1c4/0x270
 __x64_sys_munmap+0x60/0x70
 do_syscall_64+0x44/0xa0
 entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x589827
Code: 00 00 00 48 c7 c2 98 ff ff ff f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb 85 66 2e 0f 1f 84 00 00 00 00 00 90 b8 0b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 98 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffee601ec08 EFLAGS: 00000206 ORIG_RAX: 000000000000000b
RAX: ffffffffffffffda RBX: 0000400000000000 RCX: 0000000000589827
RDX: 0000000000000000 RSI: 00007ffffffff000 RDI: 0000000000000000
RBP: 00000000004cf000 R08: 00007ffee601ec40 R09: 0000000000923bf0
R10: 0000000000000008 R11: 0000000000000206 R12: 0000000000001000
R13: 00000000004cf040 R14: 0000000000000002 R15: 00007ffee601ed58
 </TASK>

Allocated by task 19303:
 __kasan_slab_alloc+0xaf/0xe0
 kmem_cache_alloc_bulk+0x261/0x360
 mas_alloc_nodes+0x2d7/0x4d0
 mas_preallocate+0xe2/0x230
 do_mas_align_munmap+0x1ce/0xd10
 do_mas_munmap+0x1bb/0x210
 __vm_munmap+0x1c4/0x270
 __x64_sys_munmap+0x60/0x70
 do_syscall_64+0x44/0xa0
 entry_SYSCALL_64_after_hwframe+0x46/0xb0

The buggy address belongs to the object at ffff88c35a3b9e00
 which belongs to the cache maple_node of size 256
The buggy address is located 128 bytes inside of
 256-byte region [ffff88c35a3b9e00, ffff88c35a3b9f00)

The buggy address belongs to the physical page:
page:00000000325428b6 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x435a3b9
flags: 0x1400000000000200(slab|node=1|zone=1)
raw: 1400000000000200 ffffea010d71a5c8 ffffea010d71dec8 ffff88810004ff00
raw: 0000000000000000 ffff88c35a3b9000 0000000100000008 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88c35a3b9e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff88c35a3b9e80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff88c35a3b9f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                   ^
 ffff88c35a3b9f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88c35a3ba000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================