Re: [PATCH bpf 1/2] bpf: sockmap, fix introduced strparser recursive lock

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jun 25, 2024 at 01:16 PM -07, John Fastabend wrote:
> Originally there was a race where removing a psock from the sock map while
> it was also receiving an skb and calling sk_psock_data_ready(). It was
> possible the removal code would NULL/set the data_ready callback while
> concurrently calling the hook from receive path. The fix was to wrap the
> access in sk_callback_lock to ensure the saved_data_ready pointer didn't
> change under us. There was some discussion around doing a larger change
> to ensure we could use READ_ONCE/WRITE_ONCE over the callback, but that
> was for *next kernels not stable fixes.
>
> But, we unfortunately introduced a regression with the fix because there
> is another path into this code (that didn't have a test case) through
> the stream parser. The stream parser runs with the lower lock which means
> we get the following splat and lock up.
>
>
>  ============================================
>  WARNING: possible recursive locking detected
>  6.10.0-rc2 #59 Not tainted
>  --------------------------------------------
>  test_sockmap/342 is trying to acquire lock:
>  ffff888007a87228 (clock-AF_INET){++--}-{2:2}, at:
>  sk_psock_skb_ingress_enqueue (./include/linux/skmsg.h:467
>  net/core/skmsg.c:555)
>
>  but task is already holding lock:
>  ffff888007a87228 (clock-AF_INET){++--}-{2:2}, at:
>  sk_psock_strp_data_ready (net/core/skmsg.c:1120)
>
> To fix ensure we do not grap lock when we reach this code through the
> strparser.
>
> Fixes: 6648e613226e1 ("bpf, skmsg: Fix NULL pointer dereference in sk_psock_skb_ingress_enqueue")
> Reported-by: Vincent Whitchurch <vincent.whitchurch@xxxxxxxxxxxxx>
> Signed-off-by: John Fastabend <john.fastabend@xxxxxxxxx>
> ---
>  include/linux/skmsg.h | 9 +++++++--
>  net/core/skmsg.c      | 5 ++++-
>  2 files changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
> index c9efda9df285..3659e9b514d0 100644
> --- a/include/linux/skmsg.h
> +++ b/include/linux/skmsg.h
> @@ -461,13 +461,18 @@ static inline void sk_psock_put(struct sock *sk, struct sk_psock *psock)
>  		sk_psock_drop(sk, psock);
>  }
>  
> -static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
> +static inline void __sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
>  {
> -	read_lock_bh(&sk->sk_callback_lock);
>  	if (psock->saved_data_ready)
>  		psock->saved_data_ready(sk);
>  	else
>  		sk->sk_data_ready(sk);
> +}
> +
> +static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
> +{
> +	read_lock_bh(&sk->sk_callback_lock);
> +	__sk_psock_data_ready(sk, psock);
>  	read_unlock_bh(&sk->sk_callback_lock);
>  }
>  
> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> index fd20aae30be2..8429daecbbb6 100644
> --- a/net/core/skmsg.c
> +++ b/net/core/skmsg.c
> @@ -552,7 +552,10 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb,
>  	msg->skb = skb;
>  
>  	sk_psock_queue_msg(psock, msg);
> -	sk_psock_data_ready(sk, psock);
> +	if (skb_bpf_strparser(skb))
> +		__sk_psock_data_ready(sk, psock);
> +	else
> +		sk_psock_data_ready(sk, psock);
>  	return copied;
>  }

If I follow, this is the call chain that leads to the recursive lock:

sock::sk_data_ready → sk_psock_strp_data_ready
    write_lock_bh(&sk->sk_callback_lock)
    strp_data_ready
      strp_read_sock
        proto_ops::read_sock → tcp_read_sock
          strp_recv
            __strp_recv
              strp_callbacks::rcv_msg → sk_psock_strp_read
                  sk_psock_verdict_apply(verdict=__SK_PASS)
                    sk_psock_skb_ingress_self
                      sk_psock_skb_ingress_enqueue
                        sk_psock_data_ready
                          read_lock_bh(&sk->sk_callback_lock) !!!

What I don't get, though, is why strp_data_ready has to be called with a
_writer_ lock? Maybe that should just be a reader lock, and then it can
be recursive.





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux