On Wed, Jun 15, 2022 at 9:02 PM Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
>
> On Wed, Jun 15, 2022 at 8:56 PM Liam Howlett <liam.howlett@xxxxxxxxxx> wrote:
> >
> > * Yu Zhao <yuzhao@xxxxxxxxxx> [220615 21:59]:
> > > On Wed, Jun 15, 2022 at 7:50 PM Liam Howlett <liam.howlett@xxxxxxxxxx> wrote:
> > > >
> > > > * Yu Zhao <yuzhao@xxxxxxxxxx> [220615 17:17]:
> > > >
> > > > ...
> > > >
> > > > > > Yes, I used the same parameters with 512GB of RAM, and the kernel with
> > > > > > KASAN and other debug options.
> > > > >
> > > > > Sorry, Liam. I got the same crash :(
> > > >
> > > > Thanks for running this promptly. I am trying to get my own server
> > > > setup now.
> > > >
> > > > >
> > > > > 9d27f2f1487a (tag: mm-everything-2022-06-14-19-05, akpm/mm-everything)
> > > > > 00d4d7b519d6 fs/userfaultfd: Fix vma iteration in mas_for_each() loop
> > > > > 55140693394d maple_tree: Make mas_prealloc() error checking more generic
> > > > > 2d7e7c2fcf16 maple_tree: Fix mt_destroy_walk() on full non-leaf non-alloc nodes
> > > > > 4d4472148ccd maple_tree: Change spanning store to work on larger trees
> > > > > ea36bcc14c00 test_maple_tree: Add tests for preallocations and large
> > > > > spanning writes
> > > > > 0d2aa86ead4f mm/mlock: Drop dead code in count_mm_mlocked_page_nr()
> > > > >
> > > > > ==================================================================
> > > > > BUG: KASAN: slab-out-of-bounds in mab_mas_cp+0x2d9/0x6c0
> > > > > Write of size 136 at addr ffff88c35a3b9e80 by task stress-ng/19303
> > > > >
> > > > > CPU: 66 PID: 19303 Comm: stress-ng Tainted: G S I 5.19.0-smp-DEV #1
> > > > > Call Trace:
> > > > > <TASK>
> > > > > dump_stack_lvl+0xc5/0xf4
> > > > > print_address_description+0x7f/0x460
> > > > > print_report+0x10b/0x240
> > > > > ? mab_mas_cp+0x2d9/0x6c0
> > > > > kasan_report+0xe6/0x110
> > > > > ? mast_spanning_rebalance+0x2634/0x29b0
> > > > > ? mab_mas_cp+0x2d9/0x6c0
> > > > > kasan_check_range+0x2ef/0x310
> > > > > ? mab_mas_cp+0x2d9/0x6c0
> > > > > ? mab_mas_cp+0x2d9/0x6c0
> > > > > memcpy+0x44/0x70
> > > > > mab_mas_cp+0x2d9/0x6c0
> > > > > mas_spanning_rebalance+0x1a3e/0x4f90
> > > >
> > > > Does this translate to an inline around line 2997?
> > > > And then probably around 2808?
> > >
> > > $ ./scripts/faddr2line vmlinux mab_mas_cp+0x2d9
> > > mab_mas_cp+0x2d9/0x6c0:
> > > mab_mas_cp at lib/maple_tree.c:1988
> > > $ ./scripts/faddr2line vmlinux mas_spanning_rebalance+0x1a3e
> > > mas_spanning_rebalance+0x1a3e/0x4f90:
> > > mast_cp_to_nodes at lib/maple_tree.c:?
> > > (inlined by) mas_spanning_rebalance at lib/maple_tree.c:2997
> > > $ ./scripts/faddr2line vmlinux mas_wr_spanning_store+0x16c5
> > > mas_wr_spanning_store+0x16c5/0x1b80:
> > > mas_wr_spanning_store at lib/maple_tree.c:?
> > >
> > > No idea why faddr2line didn't work for the last two addresses. GDB
> > > seems more reliable.
> > >
> > > (gdb) li *(mab_mas_cp+0x2d9)
> > > 0xffffffff8226b049 is in mab_mas_cp (lib/maple_tree.c:1988).
> > > (gdb) li *(mas_spanning_rebalance+0x1a3e)
> > > 0xffffffff822633ce is in mas_spanning_rebalance (lib/maple_tree.c:2801).
> > > quit)
> > > (gdb) li *(mas_wr_spanning_store+0x16c5)
> > > 0xffffffff8225cfb5 is in mas_wr_spanning_store (lib/maple_tree.c:4030).
> >
> >
> > Thanks. I am not having luck recreating it. I am hitting what looks
> > like an unrelated issue in the unstable mm, "scheduling while atomic".
> > I will try the git commit you indicate above.
>
> Fix here:
> https://lore.kernel.org/linux-mm/20220615160446.be1f75fd256d67e57b27a9fc@xxxxxxxxxxxxxxxxxxxx/

A seemingly new crash on arm64:

KASAN: null-ptr-deref in range [0x0000000000000000-0x000000000000000f]
pc : __hwasan_check_x2_67043363+0x4/0x34
lr : mas_wr_walk_descend+0xe0/0x2c0
sp : ffffffc0164378d0
x29: ffffffc0164378f0 x28: 13ffff8028ee7328 x27: ffffffc016437a68
x26: 0dffff807aa63710 x25: ffffffc016437a60 x24: 51ffff8028ee1928
x23: ffffffc016437a78 x22: ffffffc0164379e0 x21: ffffffc016437998
x20: efffffc000000000 x19: ffffffc016437998 x18: 07ffff8077718180
x17: 45ffff800b366010 x16: 0000000000000000 x15: 9cffff8092bfcdf0
x14: ffffffefef411b8c x13: 0000000000000001 x12: 0000000000000002
x11: ffffffffffffff00 x10: 0000000000000000 x9 : efffffc000000000
x8 : ffffffc016437a60 x7 : 0000000000000000 x6 : ffffffefef8246cc
x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffffffeff0bf48ee
x2 : 0000000000000008 x1 : ffffffc0164379b8 x0 : ffffffc016437998
Call trace:
__hwasan_check_x2_67043363+0x4/0x34
mas_wr_store_entry+0x178/0x5c0
mas_store+0x88/0xc8
dup_mmap+0x4bc/0x6d8
dup_mm+0x8c/0x17c
copy_mm+0xb0/0x12c
copy_process+0xa44/0x17d4
kernel_clone+0x100/0x2cc
__arm64_sys_clone+0xf4/0x120
el0_svc_common+0xfc/0x1cc
do_el0_svc_compat+0x38/0x5c
el0_svc_compat+0x68/0xf4
el0t_32_sync_handler+0xc0/0xf0
el0t_32_sync+0x190/0x194
Code: aa0203e0 d2800441 141e931d 9344dc50 (38706930)