On Fri, 2024-03-08 at 12:01 -0800, Kuniyuki Iwashima wrote: > syzkaller reported a warning of netns tracker [0] followed by KASAN > splat [1] and another ref tracker warning [1]. > > syzkaller could not find a repro, but in the log, the only suspicious > sequence was as follows: > > 18:26:22 executing program 1: > r0 = socket$inet6_mptcp(0xa, 0x1, 0x106) > ... > connect$inet6(r0, &(0x7f0000000080)={0xa, 0x4001, 0x0, @loopback}, 0x1c) (async) > > The notable thing here is 0x4001 in connect(), which is RDS_TCP_PORT. > > So, the scenario would be: > > 1. unshare(CLONE_NEWNET) creates a per netns tcp listener in > rds_tcp_listen_init(). > 2. syz-executor connect()s to it and creates a reqsk. > 3. syz-executor exit()s immediately. > 4. netns is dismantled. [0] > 5. reqsk timer is fired, and UAF happens while freeing reqsk. [1] > 6. listener is freed after RCU grace period. [2] > > Basically, reqsk assumes that the listener guarantees netns safety > until all reqsk timers are expired by holding the listener's refcount. > However, this was not the case for kernel sockets. > > Commit 740ea3c4a0b2 ("tcp: Clean up kernel listener's reqsk in > inet_twsk_purge()") fixed this issue only for per-netns ehash. > > Let's apply the same fix for the global ehash. > > [0]: > ref_tracker: net notrefcnt@0000000065449cc3 has 1/1 users at > sk_alloc (./include/net/net_namespace.h:337 net/core/sock.c:2146) > inet6_create (net/ipv6/af_inet6.c:192 net/ipv6/af_inet6.c:119) > __sock_create (net/socket.c:1572) > rds_tcp_listen_init (net/rds/tcp_listen.c:279) > rds_tcp_init_net (net/rds/tcp.c:577) > ops_init (net/core/net_namespace.c:137) > setup_net (net/core/net_namespace.c:340) > copy_net_ns (net/core/net_namespace.c:497) > create_new_namespaces (kernel/nsproxy.c:110) > unshare_nsproxy_namespaces (kernel/nsproxy.c:228 (discriminator 4)) > ksys_unshare (kernel/fork.c:3429) > __x64_sys_unshare (kernel/fork.c:3496) > do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83) > entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129) > ... > WARNING: CPU: 0 PID: 27 at lib/ref_tracker.c:179 ref_tracker_dir_exit (lib/ref_tracker.c:179) > > [1]: > BUG: KASAN: slab-use-after-free in inet_csk_reqsk_queue_drop (./include/net/inet_hashtables.h:180 net/ipv4/inet_connection_sock.c:952 net/ipv4/inet_connection_sock.c:966) > Read of size 8 at addr ffff88801b370400 by task swapper/0/0 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 > Call Trace: > <IRQ> > dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1)) > print_report (mm/kasan/report.c:378 mm/kasan/report.c:488) > kasan_report (mm/kasan/report.c:603) > inet_csk_reqsk_queue_drop (./include/net/inet_hashtables.h:180 net/ipv4/inet_connection_sock.c:952 net/ipv4/inet_connection_sock.c:966) > reqsk_timer_handler (net/ipv4/inet_connection_sock.c:979 net/ipv4/inet_connection_sock.c:1092) > call_timer_fn (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/timer.h:127 kernel/time/timer.c:1701) > __run_timers.part.0 (kernel/time/timer.c:1752 kernel/time/timer.c:2038) > run_timer_softirq (kernel/time/timer.c:2053) > __do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/irq.h:142 kernel/softirq.c:554) > irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632 kernel/softirq.c:644) > sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1076 (discriminator 14)) > </IRQ> > > Allocated by task 258 on cpu 0 at 83.612050s: > kasan_save_stack (mm/kasan/common.c:48) > kasan_save_track (mm/kasan/common.c:68) > __kasan_slab_alloc (mm/kasan/common.c:343) > kmem_cache_alloc (mm/slub.c:3813 mm/slub.c:3860 mm/slub.c:3867) > copy_net_ns (./include/linux/slab.h:701 net/core/net_namespace.c:421 net/core/net_namespace.c:480) > create_new_namespaces (kernel/nsproxy.c:110) > unshare_nsproxy_namespaces (kernel/nsproxy.c:228 (discriminator 4)) > ksys_unshare (kernel/fork.c:3429) > __x64_sys_unshare (kernel/fork.c:3496) > do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83) > entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129) > > Freed by task 27 on cpu 0 at 329.158864s: > kasan_save_stack (mm/kasan/common.c:48) > kasan_save_track (mm/kasan/common.c:68) > kasan_save_free_info (mm/kasan/generic.c:643) > __kasan_slab_free (mm/kasan/common.c:265) > kmem_cache_free (mm/slub.c:4299 mm/slub.c:4363) > cleanup_net (net/core/net_namespace.c:456 net/core/net_namespace.c:446 net/core/net_namespace.c:639) > process_one_work (kernel/workqueue.c:2638) > worker_thread (kernel/workqueue.c:2700 kernel/workqueue.c:2787) > kthread (kernel/kthread.c:388) > ret_from_fork (arch/x86/kernel/process.c:153) > ret_from_fork_asm (arch/x86/entry/entry_64.S:250) > > The buggy address belongs to the object at ffff88801b370000 > which belongs to the cache net_namespace of size 4352 > The buggy address is located 1024 bytes inside of > freed 4352-byte region [ffff88801b370000, ffff88801b371100) > > [2]: > WARNING: CPU: 0 PID: 95 at lib/ref_tracker.c:228 ref_tracker_free (lib/ref_tracker.c:228 (discriminator 1)) > Modules linked in: > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 > RIP: 0010:ref_tracker_free (lib/ref_tracker.c:228 (discriminator 1)) > ... > Call Trace: > <IRQ> > __sk_destruct (./include/net/net_namespace.h:353 net/core/sock.c:2204) > rcu_core (./arch/x86/include/asm/preempt.h:26 kernel/rcu/tree.c:2165 kernel/rcu/tree.c:2433) > __do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/irq.h:142 kernel/softirq.c:554) > irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632 kernel/softirq.c:644) > sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1076 (discriminator 14)) > </IRQ> > > Reported-by: syzkaller <syzkaller@xxxxxxxxxxxxxxxx> > Suggested-by: Eric Dumazet <edumazet@xxxxxxxxxx> > Fixes: 467fa15356ac ("RDS-TCP: Support multiple RDS-TCP listen endpoints, one per netns.") > Signed-off-by: Kuniyuki Iwashima <kuniyu@xxxxxxxxxx> Eric, the patches LGTM, and I can see your suggested-by tag, but better safe then sorry: could you please confirm it is ok for you too? Thanks, Paolo