Re: [bpf PATCH] bpf, sockmap: RCU splat with TLS redirect and strparser

John Fastabend <john.fastabend@xxxxxxxxx> · Thu, 25 Jun 2020 09:46:55 -0700

Martin KaFai Lau wrote:
> On Wed, Jun 24, 2020 at 02:09:23PM -0700, John Fastabend wrote:
> > Redirect on non-TLS sockmap side has RCU lock held from sockmap code
> > path but when called from TLS this is no longer true. The RCU section
> > is needed because we use rcu dereference to fetch the psock of the
> > socket we are redirecting to.
> sk_psock_verdict_apply() is also called by sk_psock_strp_read() after
> rcu_read_unlock().  This issue should not be limited to tls?

The base case is covered because the non-TLS case is wrapped in
rcu_read_lock/unlock here,

 static void sk_psock_strp_data_ready(struct sock *sk)
 {
	struct sk_psock *psock;

	rcu_read_lock();
	psock = sk_psock(sk);
	if (likely(psock)) {
		if (tls_sw_has_ctx_rx(sk)) {
			psock->parser.saved_data_ready(sk);
		} else {
			write_lock_bh(&sk->sk_callback_lock);
			strp_data_ready(&psock->parser.strp);
			write_unlock_bh(&sk->sk_callback_lock);
		}
	}
	rcu_read_unlock();
 }

There is a case that has existed for awhile where if a skb_clone()
fails or alloc_skb_for_msg() fails when building a merged skb. We
could call back into sk_psock_strp_read() from a workqueue in
strparser that would not be covered by above sk_psock_strp_data_ready().
This would hit the sk_psock_verdict_apply() you caught above.

We don't actually see this from selftests because in selftests we
always return skb->len indicating a msg is a single skb. In our
use cases this is all we've ever used to date so we've not actually
hit the case you call out. Another case that might hit this, based
on code review, is some of the zero copy TX cases.

To fix the above case I think its best to submit another patch
because the Fixes tag will be different. Sound OK? I could push
them as a two patch series if that is easier to understand.

Also I should have a patch shortly to allow users to skip setting
up a parse_msg hook for the strparser. This helps because the
vast majority of use cases I've seen just use skb->len as the
message deliminator. It also bypasses above concern.

Thanks,
John