Re: [PATCH] net: bpf: handle return value of BPF_CGROUP_RUN_PROG_INET4_POST_BIND()

Jakub Kicinski <kuba@xxxxxxxxxx> · Wed, 29 Dec 2021 13:09:27 -0800

On Mon, 27 Dec 2021 14:20:35 +0800 menglong8.dong@xxxxxxxxx wrote:
> From: Menglong Dong <imagedong@xxxxxxxxxxx>
> 
> The return value of BPF_CGROUP_RUN_PROG_INET4_POST_BIND() in
> __inet_bind() is not handled properly. When the return value
> is non-zero, the code sets inet_saddr and inet_rcv_saddr to 0 and
> exits:
> 
> 	err = BPF_CGROUP_RUN_PROG_INET4_POST_BIND(sk);
> 	if (err) {
> 		inet->inet_saddr = inet->inet_rcv_saddr = 0;
> 		goto out_release_sock;
> 	}
> 
> Let's take UDP as an example and see what happens. A UDP
> socket is added to 'udp_prot.h.udp_table->hash' and
> 'udp_prot.h.udp_table->hash2' once sk->sk_prot->get_port()
> has succeeded. If 'inet->inet_rcv_saddr' was specified,
> then 'sk' will sit in an 'hslot2' of 'hash2' that it doesn't belong
> to (because inet_saddr is changed to 0), and received UDP packets
> will not be passed to this sock. If 'inet->inet_rcv_saddr' was not
> specified, the sock will work fine and can receive packets
> properly, which is weird, as the bind() has already failed.
> 
> I'm not sure what we should do here. Maybe we should unhash the sock
> for UDP, so that the user can try to bind another port?

Enumerating the L4 unwind paths in L3 code seems like a fairly clear
layering violation. A new callback to undo ->sk_prot->get_port() may
be better.
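
To illustrate the shape of that idea only, here is a minimal userspace
sketch. Everything in it is hypothetical: struct model_sock, struct
model_proto, the put_port hook and model_bind() are simplified stand-ins
invented for this example, not the kernel's struct sock / struct proto,
and the post_bind_ok flag merely stands in for the verdict of the cgroup
BPF post-bind program. The point is that the generic bind path calls one
per-protocol undo hook instead of comparing sk_prot against udp_prot and
tcp_prot.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct model_sock;

/* Hypothetical per-protocol ops: put_port undoes what get_port did. */
struct model_proto {
	int  (*get_port)(struct model_sock *sk, unsigned short port);
	void (*put_port)(struct model_sock *sk);
};

/* Hypothetical, heavily simplified socket state. */
struct model_sock {
	const struct model_proto *prot;
	unsigned int rcv_saddr;
	unsigned short port;
	bool hashed;
};

/* Model of a protocol claiming a port and hashing the socket. */
static int model_get_port(struct model_sock *sk, unsigned short port)
{
	sk->port = port;
	sk->hashed = true;
	return 0;
}

/* Model of the undo step: unhash the socket and release the port. */
static void model_put_port(struct model_sock *sk)
{
	sk->hashed = false;
	sk->port = 0;
}

static const struct model_proto model_udp = {
	.get_port = model_get_port,
	.put_port = model_put_port,
};

/*
 * Generic (L3-like) bind path. It never inspects which protocol it is
 * dealing with; on a post-bind rejection it unwinds through the hook.
 */
static int model_bind(struct model_sock *sk, unsigned int addr,
		      unsigned short port, bool post_bind_ok)
{
	int err;

	sk->rcv_saddr = addr;
	err = sk->prot->get_port(sk, port);
	if (err) {
		sk->rcv_saddr = 0;
		return err;
	}

	if (!post_bind_ok) {
		if (sk->prot->put_port)
			sk->prot->put_port(sk);
		sk->rcv_saddr = 0;
		return -EPERM;
	}

	return 0;
}
```

With this shape a rejected bind leaves the model socket unhashed with its
address cleared, so a later bind to another port starts from a clean
state, and a new protocol only has to supply its own undo hook.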

Doesn't IPv6 need a similar change?

You need to provide a selftest to validate the expected behavior.

> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> index 04067b249bf3..9e5710f40a39 100644
> --- a/net/ipv4/af_inet.c
> +++ b/net/ipv4/af_inet.c
> @@ -530,7 +530,14 @@ int __inet_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len,
>  		if (!(flags & BIND_FROM_BPF)) {
>  			err = BPF_CGROUP_RUN_PROG_INET4_POST_BIND(sk);
>  			if (err) {
> +				if (sk->sk_prot == &udp_prot)
> +					sk->sk_prot->unhash(sk);
> +				else if (sk->sk_prot == &tcp_prot)
> +					inet_put_port(sk);
> +
>  				inet->inet_saddr = inet->inet_rcv_saddr = 0;
> +				err = -EPERM;
> +
>  				goto out_release_sock;
>  			}
>  		}