RE: [PATCH] tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> -----Original Message-----
> From: John Fastabend [mailto:john.fastabend@xxxxxxxxx]
> Sent: Friday, October 1, 2021 6:25 AM
> To: liujian (CE) <liujian56@xxxxxxxxxx>; john.fastabend@xxxxxxxxx;
> daniel@xxxxxxxxxxxxx; jakub@xxxxxxxxxxxxxx; lmb@xxxxxxxxxxxxxx;
> edumazet@xxxxxxxxxx; davem@xxxxxxxxxxxxx; yoshfuji@xxxxxxxxxxxxxx;
> dsahern@xxxxxxxxxx; kuba@xxxxxxxxxx; ast@xxxxxxxxxx; andrii@xxxxxxxxxx;
> kafai@xxxxxx; songliubraving@xxxxxx; yhs@xxxxxx; kpsingh@xxxxxxxxxx;
> netdev@xxxxxxxxxxxxxxx; bpf@xxxxxxxxxxxxxxx
> Cc: liujian (CE) <liujian56@xxxxxxxxxx>
> Subject: RE: [PATCH] tcp_bpf: Fix one concurrency problem in the
> tcp_bpf_send_verdict function
> 
> Liu Jian wrote:
> > In the following cases:
> > We need to redirect the first msg to sock1 and the second msg to sock2.
> > The sock lock needs to be released at __SK_REDIRECT and to get another
> > sock lock, this will cause the probability that psock->eval is not set
> > to __SK_NONE when the second msg comes.
> >
> > If psock does not set apple bytes, fix this by do the cleanup before
> > releasing the sock lock. And keep the original logic in other cases.
> 
> It took me sometime to figure out the above description. Please include a bit
> more details here this needs to be backported so we want to be very clear
> what the error  is and how to trigger it.
> 
> In this case we should list the flow to show how the interleaving of msgs
> breaks.
> 
> "
> With two Msgs, msgA and msgB and a user doing nonblocking sendmsg calls
> (or multiple cores) on a single socket 'sk' we could get the following flow.
> 
>  msgA, sk                               msgB, sk
>  -----------                            ---------------
>  tcp_bpf_sendmsg()
>  lock(sk)
>  psock = sk->psock
>                                         tcp_bpf_sendmsg()
>                                         lock(sk) ... blocking  tcp_bpf_send_verdict  if (psock-
> >eval == NONE)
>    psock->eval = sk_psock_msg_verdict
>  ..
>  < handle SK_REDIRECT case >
>    release_sock(sk)                     < lock dropped so grab here >
>    ret = tcp_bpf_sendmsg_redir
>                                         psock = sk->psock
>                                         tcp_bpf_send_verdict
>  lock_sock(sk) ... blocking on B
>                                         if (psock->eval == NONE) <- boom.
>                                          psock->eval will have msgA state
> 
> The problem here is we dropped the lock on msgA and grabbed it with msgB.
> Now we have old state in psock and importantly psock->eval has not been
> cleared. So msgB will run whatever action was done on A and the verdict
> program may never see it.
> "
> 
> Showing the flow makes it painfully obvious why dropping that lock with old
> state is broken.
> 
Thanks a lot for such a detailed example.
> 
> >
> > Signed-off-by: Liu Jian <liujian56@xxxxxxxxxx>
> 
> We need a fixes tag as well so we can backport this.
 I will add it.
> 
> > ---
> >  net/ipv4/tcp_bpf.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index
> > d3e9386b493e..02442e43ac4d 100644
> > --- a/net/ipv4/tcp_bpf.c
> > +++ b/net/ipv4/tcp_bpf.c
> > @@ -232,6 +232,7 @@ static int tcp_bpf_send_verdict(struct sock *sk,
> struct sk_psock *psock,
> >  	bool cork = false, enospc = sk_msg_full(msg);
> >  	struct sock *sk_redir;
> >  	u32 tosend, delta = 0;
> > +	u32 eval = __SK_NONE;
> >  	int ret;
> >
> >  more_data:
> > @@ -274,6 +275,12 @@ static int tcp_bpf_send_verdict(struct sock *sk,
> struct sk_psock *psock,
> >  		break;
> >  	case __SK_REDIRECT:
> >  		sk_redir = psock->sk_redir;
> > +		if (!psock->apply_bytes) {
> > +			/* Clean up before releasing the sock lock. */
> > +			eval = psock->eval;
> > +			psock->eval = __SK_NONE;
> > +			psock->sk_redir = NULL;
> > +		}
> 
> We need to move above chunk below sk_msg_apply_bytes() so we account
> for the bytes and if we zero apply bytes with this send we clear the psock
> state. Otherwise we could have the same issue with stale state on the
> boundary where apply bytes is met.
> 
> >  		sk_msg_apply_bytes(psock, tosend);
> 
> <-- put above chunk here.
yes, here looks better. 
> 
> >  		if (psock->cork) {
> >  			cork = true;
> 
> Interestingly, I caught the race with cork state, but missed it with the eval
> case. Likely because our program redirected to a single sk.
> 
Yes. 
> > @@ -281,7 +288,12 @@ static int tcp_bpf_send_verdict(struct sock *sk,
> struct sk_psock *psock,
> >  		}
> >  		sk_msg_return(sk, msg, tosend);
> >  		release_sock(sk);
> > +
> >  		ret = tcp_bpf_sendmsg_redir(sk_redir, msg, tosend, flags);
> > +
> > +		if (eval == __SK_REDIRECT)
> 
> Is the 'if' needed? we are in this case because eval is SK_REDIRECT.
> 
Need it, because If the "apply bytes" is not zero, i did not change the logic.
> > +			sock_put(sk_redir);
> > +
> >  		lock_sock(sk);
> >  		if (unlikely(ret < 0)) {
> >  			int free = sk_msg_free_nocharge(sk, msg);
> > --
> > 2.17.1
> >




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux