March 20, 2025 at 20:32, "Michal Luczaj" <mhal@xxxxxxx> wrote: > > On 3/17/25 10:22, Jiayuan Chen wrote: > > > > > The sk->sk_socket is not locked or referenced, and during the call to > > > > skb_send_sock(), there is a race condition with the release of sk_socket. > > > > All types of sockets(tcp/udp/unix/vsock) will be affected. > > > > ... > > > > Some approach I tried > > > > ... > > > > 2. Increased the reference of sk_socket->file: > > > > - If the user calls close(fd), we will do nothing because the reference > > > > count is not set to 0. It's unexpected. > > > > Have you considered bumping file's refcnt only for the time of > > send/callback? Along the lines of: > > static struct file *sock_get_file(struct sock *sk) > > { > > struct file *file = NULL; > > struct socket *sock; > > rcu_read_lock(); > > sock = sk->sk_socket; > > if (sock) > > file = get_file_active(&sock->file); > > rcu_read_unlock(); > > return file; > > } > > static int sk_psock_handle_skb(struct sk_psock *psock, struct sk_buff *skb, > > u32 off, u32 len, bool ingress) > > { > > int err; > > if (!ingress) { > > struct sock *sk = psock->sk; > > struct file *file; > > ... > > file = sock_get_file(sk); > > if (!file) > > return -EIO; > > err = skb_send_sock(sk, skb, off, len); > > fput(file); > > return err; > > } > > ... > > } > > static void sk_psock_verdict_data_ready(struct sock *sk) > > { > > struct file *file; > > ... > > file = sock_get_file(sk); > > if (!file) > > return; > > copied = sk->sk_socket->ops->read_skb(sk, sk_psock_verdict_recv); > > fput(file); > > ... > > } > Thank you for your suggestions. I previously attempted a similar approach in another version, but felt that manipulating the file layer within sockmap was quite odd. Moreover, the actual process flow is more complex than we initially imagined. The current overall close process looks roughly like this: ''' close(fd): file_ref_put(file): __sock_release(sock) sock_map_close() saved_close() sk->sk_socket = NULL sock->ops = NULL sock->file = NULL ''' We can observe that while sk->sk_socket might not be NULL, it doesn’t guarantee that sock->file is not NULL. This means the following code is problematic: ''' sock = sk->sk_socket; if (sock) sock->file->xx <== sock->file may be set to NULL ''' Of course, we might try something like: ''' try_hold_sock() { rcu_read_lock(); sock = sk->sk_socket; if (!sock) unlock_return; file = sock->file; if (!file) unlock_return; file = get_file_active(&file); rcu_read_unlock(); return file; } ''' Acquiring the socket's reference count requires significant effort, and we need to pay special attention to the socket's own release process to ensure correctness. Ultimately, this led me to decide to operate on psock instead of directly manipulating the file layer to avoid this issue.