Jason Xing wrote: > From: Jason Xing <kernelxing@xxxxxxxxxxx> > > Fix NULL pointer data-races in sk_psock_skb_ingress_enqueue() which > syzbot reported [1]. > > [1] > BUG: KCSAN: data-race in sk_psock_drop / sk_psock_skb_ingress_enqueue > > write to 0xffff88814b3278b8 of 8 bytes by task 10724 on cpu 1: > sk_psock_stop_verdict net/core/skmsg.c:1257 [inline] > sk_psock_drop+0x13e/0x1f0 net/core/skmsg.c:843 > sk_psock_put include/linux/skmsg.h:459 [inline] > sock_map_close+0x1a7/0x260 net/core/sock_map.c:1648 > unix_release+0x4b/0x80 net/unix/af_unix.c:1048 > __sock_release net/socket.c:659 [inline] > sock_close+0x68/0x150 net/socket.c:1421 > __fput+0x2c1/0x660 fs/file_table.c:422 > __fput_sync+0x44/0x60 fs/file_table.c:507 > __do_sys_close fs/open.c:1556 [inline] > __se_sys_close+0x101/0x1b0 fs/open.c:1541 > __x64_sys_close+0x1f/0x30 fs/open.c:1541 > do_syscall_64+0xd3/0x1d0 > entry_SYSCALL_64_after_hwframe+0x6d/0x75 > > read to 0xffff88814b3278b8 of 8 bytes by task 10713 on cpu 0: > sk_psock_data_ready include/linux/skmsg.h:464 [inline] > sk_psock_skb_ingress_enqueue+0x32d/0x390 net/core/skmsg.c:555 > sk_psock_skb_ingress_self+0x185/0x1e0 net/core/skmsg.c:606 > sk_psock_verdict_apply net/core/skmsg.c:1008 [inline] > sk_psock_verdict_recv+0x3e4/0x4a0 net/core/skmsg.c:1202 > unix_read_skb net/unix/af_unix.c:2546 [inline] > unix_stream_read_skb+0x9e/0xf0 net/unix/af_unix.c:2682 > sk_psock_verdict_data_ready+0x77/0x220 net/core/skmsg.c:1223 > unix_stream_sendmsg+0x527/0x860 net/unix/af_unix.c:2339 > sock_sendmsg_nosec net/socket.c:730 [inline] > __sock_sendmsg+0x140/0x180 net/socket.c:745 > ____sys_sendmsg+0x312/0x410 net/socket.c:2584 > ___sys_sendmsg net/socket.c:2638 [inline] > __sys_sendmsg+0x1e9/0x280 net/socket.c:2667 > __do_sys_sendmsg net/socket.c:2676 [inline] > __se_sys_sendmsg net/socket.c:2674 [inline] > __x64_sys_sendmsg+0x46/0x50 net/socket.c:2674 > do_syscall_64+0xd3/0x1d0 > entry_SYSCALL_64_after_hwframe+0x6d/0x75 > > value changed: 0xffffffff83d7feb0 -> 0x0000000000000000 > > Reported by Kernel Concurrency Sanitizer on: > CPU: 0 PID: 10713 Comm: syz-executor.4 Tainted: G W 6.8.0-syzkaller-08951-gfe46a7dd189e #0 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024 > > Prior to this, commit 4cd12c6065df ("bpf, sockmap: Fix NULL pointer > dereference in sk_psock_verdict_data_ready()") fixed one NULL pointer > similarly due to no protection of saved_data_ready. Here is another > different caller causing the same issue because of the same reason. So > we should protect it with sk_callback_lock read lock because the writer > side in the sk_psock_drop() uses "write_lock_bh(&sk->sk_callback_lock);". > > To avoid errors that could happen in future, I move those two pairs of > lock into the sk_psock_data_ready(), which is suggested by John Fastabend. > > Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") > Reported-by: syzbot+aa8c8ec2538929f18f2d@xxxxxxxxxxxxxxxxxxxxxxxxx > Closes: https://syzkaller.appspot.com/bug?extid=aa8c8ec2538929f18f2d > Signed-off-by: Jason Xing <kernelxing@xxxxxxxxxxx> > --- > v2 > Link: https://lore.kernel.org/all/20240329134037.92124-1-kerneljasonxing@xxxxxxxxx/ > 1. move the read_lock_bh()/unlock_bh() into the sk_psock_data_ready() call (John) > Signed-off-by: Jason Xing <kernelxing@xxxxxxxxxxx> > --- Reviewed-by: John Fastabend <john.fastabend@xxxxxxxxx>