Re: [PATCH bpf-next v2 4/5] bpf: allow any port in bpf_bind helper

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, May 06, 2020 at 09:22:45AM -0700, sdf@xxxxxxxxxx wrote:
> On 05/05, Martin KaFai Lau wrote:
> > On Tue, May 05, 2020 at 01:27:29PM -0700, Stanislav Fomichev wrote:
> > > We want to have a tighter control on what ports we bind to in
> > > the BPF_CGROUP_INET{4,6}_CONNECT hooks even if it means
> > > connect() becomes slightly more expensive. The expensive part
> > > comes from the fact that we now need to call inet_csk_get_port()
> > > that verifies that the port is not used and allocates an entry
> > > in the hash table for it.
> > >
> > > Since we can't rely on "snum || !bind_address_no_port" to prevent
> > > us from calling POST_BIND hook anymore, let's add another bind flag
> > > to indicate that the call site is BPF program.
> > >
> > > v2:
> > > * Update documentation (Andrey Ignatov)
> > > * Pass BIND_FORCE_ADDRESS_NO_PORT conditionally (Andrey Ignatov)
> > >
> > > Cc: Andrey Ignatov <rdna@xxxxxx>
> > > Signed-off-by: Stanislav Fomichev <sdf@xxxxxxxxxx>
> > > ---
> > >  include/net/inet_common.h                     |   2 +
> > >  include/uapi/linux/bpf.h                      |   9 +-
> > >  net/core/filter.c                             |  18 ++-
> > >  net/ipv4/af_inet.c                            |  10 +-
> > >  net/ipv6/af_inet6.c                           |  12 +-
> > >  tools/include/uapi/linux/bpf.h                |   9 +-
> > >  .../bpf/prog_tests/connect_force_port.c       | 104 ++++++++++++++++++
> > >  .../selftests/bpf/progs/connect_force_port4.c |  28 +++++
> > >  .../selftests/bpf/progs/connect_force_port6.c |  28 +++++
> > >  9 files changed, 192 insertions(+), 28 deletions(-)
> > >  create mode 100644
> > tools/testing/selftests/bpf/prog_tests/connect_force_port.c
> > >  create mode 100644
> > tools/testing/selftests/bpf/progs/connect_force_port4.c
> > >  create mode 100644
> > tools/testing/selftests/bpf/progs/connect_force_port6.c
> > >
> > > diff --git a/include/net/inet_common.h b/include/net/inet_common.h
> > > index c38f4f7d660a..cb2818862919 100644
> > > --- a/include/net/inet_common.h
> > > +++ b/include/net/inet_common.h
> > > @@ -39,6 +39,8 @@ int inet_bind(struct socket *sock, struct sockaddr
> > *uaddr, int addr_len);
> > >  #define BIND_FORCE_ADDRESS_NO_PORT	(1 << 0)
> > >  /* Grab and release socket lock. */
> > >  #define BIND_WITH_LOCK			(1 << 1)
> > > +/* Called from BPF program. */
> > > +#define BIND_FROM_BPF			(1 << 2)
> > >  int __inet_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len,
> > >  		u32 flags);
> > >  int inet_getname(struct socket *sock, struct sockaddr *uaddr,
> > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > index b3643e27e264..14b5518a3d5b 100644
> > > --- a/include/uapi/linux/bpf.h
> > > +++ b/include/uapi/linux/bpf.h
> > > @@ -1994,10 +1994,11 @@ union bpf_attr {
> > >   *
> > >   * 		This helper works for IPv4 and IPv6, TCP and UDP sockets. The
> > >   * 		domain (*addr*\ **->sa_family**) must be **AF_INET** (or
> > > - * 		**AF_INET6**). Looking for a free port to bind to can be
> > > - * 		expensive, therefore binding to port is not permitted by the
> > > - * 		helper: *addr*\ **->sin_port** (or **sin6_port**, respectively)
> > > - * 		must be set to zero.
> > > + * 		**AF_INET6**). It's advised to pass zero port (**sin_port**
> > > + * 		or **sin6_port**) which triggers IP_BIND_ADDRESS_NO_PORT-like
> > > + * 		behavior and lets the kernel reuse the same source port
> > Reading "zero port" and "the same source port" together is confusing.
> Ack, let me try rephrase it a bit to make it more clear.
> 
> > > + * 		as long as 4-tuple is unique. Passing non-zero port might
> > > + * 		lead to degraded performance.
> > Is the "degraded performance" also true for UDP?
> I suppose everything that is "allocating" port at bind time can lead
> to a faster port exhaustion, so UDP should be also affected.
> Although, looking at udp_v4_get_port, it looks less involved than
> its TCP counterpart.
I was mostly curious after taking a quick look in both TCP and UDP.
It does not affect the patch here.  May be something for
optimization later.

>From skimming the codes,
I was thinking UDP should be more or less the same since
they have to go through get_port() eventually.

TCP will be degraded from inet_hash_connect() [which seems
to be using the ehash and 4 tuples] to get_port().



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux