On Fri, 30 Jun 2023 at 16:58, Ilya Maximets <i.maximets@xxxxxxx> wrote:
>
> Initial creation of an AF_XDP socket requires CAP_NET_RAW capability.
> A privileged process might create the socket and pass it to a
> non-privileged process for later use. However, that process will be
> able to bind the socket to any network interface. Even though it will
> not be able to receive any traffic without modification of the BPF map,
> the situation is not ideal.
>
> Sockets already have a mechanism that can be used to restrict what
> interface they can be attached to. That is SO_BINDTODEVICE.
>
> To change the binding the process will need CAP_NET_RAW.
>
> Make xsk_bind() honor the SO_BINDTODEVICE in order to allow safer
> workflow when non-privileged process is using AF_XDP.

Rebinding an AF_XDP socket is not allowed today. Any such attempt will
return an error from bind. So if I understand the purpose of
SO_BINDTODEVICE correctly, you could say that this option is always
set for an AF_XDP socket and it is not possible to toggle it. The only
way to "rebind" an AF_XDP socket is to close it and open a new one.
This was a conscious design decision from day one as it would be very
hard to support this, especially in zero-copy mode.

> Signed-off-by: Ilya Maximets <i.maximets@xxxxxxx>
> ---
>
> Posting as an RFC for now to probably get some feedback.
> Will re-post once the tree is open.
>
>  Documentation/networking/af_xdp.rst | 9 +++++++++
>  net/xdp/xsk.c                       | 6 ++++++
>  2 files changed, 15 insertions(+)
>
> diff --git a/Documentation/networking/af_xdp.rst b/Documentation/networking/af_xdp.rst
> index 247c6c4127e9..1cc35de336a4 100644
> --- a/Documentation/networking/af_xdp.rst
> +++ b/Documentation/networking/af_xdp.rst
> @@ -433,6 +433,15 @@ start N bytes into the buffer leaving the first N bytes for the
>  application to use. The final option is the flags field, but it will
>  be dealt with in separate sections for each UMEM flag.
>
> +SO_BINDTODEVICE setsockopt
> +--------------------------
> +
> +This is a generic SOL_SOCKET option that can be used to tie AF_XDP
> +socket to a particular network interface. It is useful when a socket
> +is created by a privileged process and passed to a non-privileged one.
> +Once the option is set, kernel will refuse attempts to bind that socket
> +to a different interface. Updating the value requires CAP_NET_RAW.
> +
>  XDP_STATISTICS getsockopt
>  -------------------------
>
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index 5a8c0dd250af..386ff641db0f 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -886,6 +886,7 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
>         struct sock *sk = sock->sk;
>         struct xdp_sock *xs = xdp_sk(sk);
>         struct net_device *dev;
> +       int bound_dev_if;
>         u32 flags, qid;
>         int err = 0;
>
> @@ -899,6 +900,11 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
>                       XDP_USE_NEED_WAKEUP))
>                 return -EINVAL;
>
> +       bound_dev_if = READ_ONCE(sk->sk_bound_dev_if);
> +
> +       if (bound_dev_if && bound_dev_if != sxdp->sxdp_ifindex)
> +               return -EINVAL;
> +
>         rtnl_lock();
>         mutex_lock(&xs->mutex);
>         if (xs->state != XSK_READY) {
> --
> 2.40.1
>
>
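
For reference, here is a minimal user-space sketch of the condition the
quoted hunk adds to xsk_bind(). It is not kernel code: the struct and
function names (bind_request, check_bound_dev, example) are made up for
illustration, and only the comparison itself is taken from the patch.
It assumes that a bound_dev_if of 0 means no SO_BINDTODEVICE restriction
has been set.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the two values the patch compares. */
struct bind_request {
	int bound_dev_if;	/* models sk->sk_bound_dev_if; 0 = unrestricted (assumed) */
	int sxdp_ifindex;	/* models sxdp->sxdp_ifindex passed to bind() */
};

/*
 * Mirrors the added check: refuse the bind only when a device
 * restriction is set and it names a different interface.
 */
static int check_bound_dev(const struct bind_request *req)
{
	if (req->bound_dev_if && req->bound_dev_if != req->sxdp_ifindex)
		return -EINVAL;
	return 0;
}

/* Illustrative walk through the three cases. */
static void example(void)
{
	struct bind_request unrestricted = { .bound_dev_if = 0, .sxdp_ifindex = 3 };
	struct bind_request same_dev     = { .bound_dev_if = 3, .sxdp_ifindex = 3 };
	struct bind_request other_dev    = { .bound_dev_if = 3, .sxdp_ifindex = 4 };

	assert(check_bound_dev(&unrestricted) == 0);	/* no restriction set */
	assert(check_bound_dev(&same_dev) == 0);	/* matches the restriction */
	assert(check_bound_dev(&other_dev) == -EINVAL);	/* different interface */
}
```

The sketch only covers the first bind attempt. A second bind on an
already bound socket is rejected separately, as described in the reply
above.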