Re: [PATCH bpf-next v3 1/3] bpf, sockmap: avoid using sk_socket after free when sending

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



March 20, 2025 at 20:32, "Michal Luczaj" <mhal@xxxxxxx> wrote:

> 
> On 3/17/25 10:22, Jiayuan Chen wrote:
> 
> > 
> > The sk->sk_socket is not locked or referenced, and during the call to
> > 
> >  skb_send_sock(), there is a race condition with the release of sk_socket.
> > 
> >  All types of sockets(tcp/udp/unix/vsock) will be affected.
> > 
> >  ...
> > 
> >  Some approach I tried
> > 
> >  ...
> > 
> >  2. Increased the reference of sk_socket->file:
> > 
> >  - If the user calls close(fd), we will do nothing because the reference
> > 
> >  count is not set to 0. It's unexpected.
> > 
> 
> Have you considered bumping file's refcnt only for the time of
> 
> send/callback? Along the lines of:
> 
> static struct file *sock_get_file(struct sock *sk)
> 
> {
> 
>  struct file *file = NULL;
> 
>  struct socket *sock;
> 
>  rcu_read_lock();
> 
>  sock = sk->sk_socket;
> 
>  if (sock)
> 
>  file = get_file_active(&sock->file);
> 
>  rcu_read_unlock();
> 
>  return file;
> 
> }
> 
> static int sk_psock_handle_skb(struct sk_psock *psock, struct sk_buff *skb,
> 
>  u32 off, u32 len, bool ingress)
> 
> {
> 
>  int err;
> 
>  if (!ingress) {
> 
>  struct sock *sk = psock->sk;
> 
>  struct file *file;
> 
>  ...
> 
>  file = sock_get_file(sk);
> 
>  if (!file)
> 
>  return -EIO;
> 
>  err = skb_send_sock(sk, skb, off, len);
> 
>  fput(file);
> 
>  return err;
> 
>  }
> 
>  ...
> 
> }
> 
> static void sk_psock_verdict_data_ready(struct sock *sk)
> 
> {
> 
>  struct file *file;
> 
>  ...
> 
>  file = sock_get_file(sk);
> 
>  if (!file)
> 
>  return;
> 
>  copied = sk->sk_socket->ops->read_skb(sk, sk_psock_verdict_recv);
> 
>  fput(file);
> 
>  ...
> 
> }
>

Thank you for your suggestions.

I previously attempted a similar approach in another version, but
felt that manipulating the file layer within sockmap was quite odd.

Moreover, the actual process flow is more complex than we initially
imagined.

The current overall close process looks roughly like this:
'''
close(fd):
    file_ref_put(file):
        __sock_release(sock)
            sock_map_close()
            saved_close()
               sk->sk_socket = NULL
            sock->ops = NULL
            sock->file = NULL
'''

We can observe that while sk->sk_socket might not be NULL, it doesn’t
guarantee that sock->file is not NULL. This means the following code
is problematic:
'''
sock = sk->sk_socket;
if (sock)
    sock->file->xx <== sock->file may be set to NULL
'''

Of course, we might try something like:
'''
try_hold_sock() {
    rcu_read_lock();
    
    sock = sk->sk_socket;
    if (!sock)
        unlock_return;
    
    file = sock->file;
    if (!file)
        unlock_return;
    
    file = get_file_active(&file);
    rcu_read_unlock();
    return file;
}
'''

Acquiring the socket's reference count requires significant effort,
and we need to pay special attention to the socket's own release
process to ensure correctness. Ultimately, this led me to decide to
operate on psock instead of directly manipulating the file layer to
avoid this issue.





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux