On Mon, 2023-10-09 at 16:30 -0400, Jeff Layton wrote: > I hit this KASAN pop today while rebooting a machine. This kernel > contains the contents of Chuck's nfsd-next branch, so it should have > all of Neil's latest thread handling patches. > > I was able to reproduce this by just running "systemctl stop nfs- > server". This does not occur on stock v6.6-rc5: > > Oct 09 15:26:08 f38-nfsd kernel: ================================================================== > Oct 09 15:26:08 f38-nfsd kernel: BUG: KASAN: slab-use-after-free in svc_exit_thread+0x10c/0x150 [sunrpc] > Oct 09 15:26:08 f38-nfsd kernel: Write of size 8 at addr ffff888112147bb8 by task lockd/973 > Oct 09 15:26:08 f38-nfsd kernel: > Oct 09 15:26:08 f38-nfsd kernel: CPU: 5 PID: 973 Comm: lockd Not tainted 6.6.0-rc4-g2615e14a89be #8 > Oct 09 15:26:08 f38-nfsd kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014 > Oct 09 15:26:08 f38-nfsd kernel: Call Trace: > Oct 09 15:26:08 f38-nfsd kernel: <TASK> > Oct 09 15:26:08 f38-nfsd kernel: dump_stack_lvl+0x43/0x60 > Oct 09 15:26:08 f38-nfsd kernel: print_report+0xc5/0x620 > Oct 09 15:26:08 f38-nfsd kernel: ? __pfx__raw_spin_lock_irqsave+0x10/0x10 > Oct 09 15:26:08 f38-nfsd kernel: ? __virt_addr_valid+0xbe/0x130 > Oct 09 15:26:08 f38-nfsd kernel: ? svc_exit_thread+0x10c/0x150 [sunrpc] > Oct 09 15:26:08 f38-nfsd kernel: kasan_report+0xac/0xe0 > Oct 09 15:26:08 f38-nfsd kernel: ? svc_exit_thread+0x10c/0x150 [sunrpc] > Oct 09 15:26:08 f38-nfsd kernel: kasan_check_range+0xf0/0x1a0 > Oct 09 15:26:08 f38-nfsd kernel: svc_exit_thread+0x10c/0x150 [sunrpc] > Oct 09 15:26:08 f38-nfsd kernel: lockd+0x15d/0x1a0 [lockd] > Oct 09 15:26:08 f38-nfsd kernel: ? __pfx_lockd+0x10/0x10 [lockd] > Oct 09 15:26:08 f38-nfsd kernel: kthread+0x181/0x1c0 > Oct 09 15:26:08 f38-nfsd kernel: ? __pfx_kthread+0x10/0x10 > Oct 09 15:26:08 f38-nfsd kernel: ret_from_fork+0x2d/0x50 > Oct 09 15:26:08 f38-nfsd kernel: ? __pfx_kthread+0x10/0x10 > Oct 09 15:26:08 f38-nfsd kernel: ret_from_fork_asm+0x1b/0x30 > Oct 09 15:26:08 f38-nfsd kernel: </TASK> > Oct 09 15:26:08 f38-nfsd kernel: > Oct 09 15:26:08 f38-nfsd kernel: Allocated by task 835: > Oct 09 15:26:08 f38-nfsd kernel: kasan_save_stack+0x1c/0x40 > Oct 09 15:26:08 f38-nfsd kernel: kasan_set_track+0x21/0x30 > Oct 09 15:26:08 f38-nfsd kernel: __kasan_kmalloc+0xa6/0xb0 > Oct 09 15:26:08 f38-nfsd kernel: __kmalloc+0x5d/0x160 > Oct 09 15:26:08 f38-nfsd kernel: __svc_create+0x2b0/0x470 [sunrpc] > Oct 09 15:26:08 f38-nfsd kernel: lockd_up+0x111/0x3b0 [lockd] > Oct 09 15:26:08 f38-nfsd kernel: nfsd_svc+0x54a/0x580 [nfsd] > Oct 09 15:26:08 f38-nfsd kernel: write_threads+0x148/0x260 [nfsd] > Oct 09 15:26:08 f38-nfsd kernel: nfsctl_transaction_write+0x76/0xa0 [nfsd] > Oct 09 15:26:08 f38-nfsd kernel: vfs_write+0x1b4/0x670 > Oct 09 15:26:08 f38-nfsd kernel: ksys_write+0xbd/0x160 > Oct 09 15:26:08 f38-nfsd kernel: do_syscall_64+0x38/0x90 > Oct 09 15:26:08 f38-nfsd kernel: entry_SYSCALL_64_after_hwframe+0x6e/0xd8 > Oct 09 15:26:08 f38-nfsd kernel: > Oct 09 15:26:08 f38-nfsd kernel: Freed by task 973: > Oct 09 15:26:08 f38-nfsd kernel: kasan_save_stack+0x1c/0x40 > Oct 09 15:26:08 f38-nfsd kernel: kasan_set_track+0x21/0x30 > Oct 09 15:26:08 f38-nfsd kernel: kasan_save_free_info+0x27/0x40 > Oct 09 15:26:08 f38-nfsd kernel: ____kasan_slab_free+0x166/0x1c0 > Oct 09 15:26:08 f38-nfsd kernel: slab_free_freelist_hook+0x9f/0x1e0 > Oct 09 15:26:08 f38-nfsd kernel: __kmem_cache_free+0x187/0x2d0 > Oct 09 15:26:08 f38-nfsd kernel: svc_destroy+0x105/0x1d0 [sunrpc] > Oct 09 15:26:08 f38-nfsd kernel: svc_exit_thread+0x135/0x150 [sunrpc] > Oct 09 15:26:08 f38-nfsd kernel: lockd+0x15d/0x1a0 [lockd] > Oct 09 15:26:08 f38-nfsd kernel: kthread+0x181/0x1c0 > Oct 09 15:26:08 f38-nfsd kernel: ret_from_fork+0x2d/0x50 > Oct 09 15:26:08 f38-nfsd kernel: ret_from_fork_asm+0x1b/0x30 > Oct 09 15:26:08 f38-nfsd kernel: > Oct 09 15:26:08 f38-nfsd kernel: The buggy address belongs to the object at ffff888112147b00 > which belongs to the cache kmalloc-192 of size 192 > Oct 09 15:26:08 f38-nfsd kernel: The buggy address is located 184 bytes inside of > freed 192-byte region [ffff888112147b00, ffff888112147bc0) > Oct 09 15:26:08 f38-nfsd kernel: > Oct 09 15:26:08 f38-nfsd kernel: The buggy address belongs to the physical page: > Oct 09 15:26:08 f38-nfsd kernel: page:00000000b3fc2355 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888112147d00 pfn:0x112146 > Oct 09 15:26:08 f38-nfsd kernel: head:00000000b3fc2355 order:1 entire_mapcount:0 nr_pages_mapped:0 pincount:0 > Oct 09 15:26:08 f38-nfsd kernel: flags: 0x17fffc000000840(slab|head|node=0|zone=2|lastcpupid=0x1ffff) > Oct 09 15:26:08 f38-nfsd kernel: page_type: 0xffffffff() > Oct 09 15:26:08 f38-nfsd kernel: raw: 017fffc000000840 ffff888100042a00 ffffea0004098790 ffffea00043e1b10 > Oct 09 15:26:08 f38-nfsd kernel: raw: ffff888112147d00 000000000020000f 00000001ffffffff 0000000000000000 > Oct 09 15:26:08 f38-nfsd kernel: page dumped because: kasan: bad access detected > Oct 09 15:26:08 f38-nfsd kernel: > Oct 09 15:26:08 f38-nfsd kernel: Memory state around the buggy address: > Oct 09 15:26:08 f38-nfsd kernel: ffff888112147a80: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc > Oct 09 15:26:08 f38-nfsd kernel: ffff888112147b00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > Oct 09 15:26:08 f38-nfsd kernel: >ffff888112147b80: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc > Oct 09 15:26:08 f38-nfsd kernel: ^ > Oct 09 15:26:08 f38-nfsd kernel: ffff888112147c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > Oct 09 15:26:08 f38-nfsd kernel: ffff888112147c80: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc > Oct 09 15:26:08 f38-nfsd kernel: ================================================================== > Some follow up info. It looks like the comment in svc_exit_thread about it not being the last reference may no longer be true for lockd? $ ./scripts/faddr2line --list net/sunrpc/sunrpc.ko svc_exit_thread+0x10c/0x150 (inlined by) svc_exit_thread at /home/jlayton/git/linux/net/sunrpc/svc.c:943 938 svc_put(serv); 939 /* That svc_put() cannot be the last, because the thread 940 * waiting for SP_VICTIM_REMAINS to clear must hold 941 * a reference. So it is still safe to access pool. 942 */ >943< clear_and_wake_up_bit(SP_VICTIM_REMAINS, &pool->sp_flags); 944 } 945 EXPORT_SYMBOL_GPL(svc_exit_thread); 946 947 /* 948 * Register an "inet" protocol family netid with the local $ ./scripts/faddr2line --list net/sunrpc/sunrpc.ko svc_exit_thread+0x135/0x150 (inlined by) svc_exit_thread at /home/jlayton/git/linux/net/sunrpc/svc.c:938 933 spin_unlock_bh(&serv->sv_lock); 934 svc_sock_update_bufs(serv); 935 936 svc_rqst_free(rqstp); 937 >938< svc_put(serv); 939 /* That svc_put() cannot be the last, because the thread 940 * waiting for SP_VICTIM_REMAINS to clear must hold 941 * a reference. So it is still safe to access pool. 942 */ 943 clear_and_wake_up_bit(SP_VICTIM_REMAINS, &pool->sp_flags); -- Jeff Layton <jlayton@xxxxxxxxxx>