Re: [PATCH bpf-next v3] bpf: bpf_{g,s}etsockopt for struct bpf_sock_addr

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Stanislav Fomichev <sdf@xxxxxxxxxx> [Fri, 2020-05-01 15:07 -0700]:
> On Fri, May 1, 2020 at 2:52 PM Andrey Ignatov <rdna@xxxxxx> wrote:
> >
> > Stanislav Fomichev <sdf@xxxxxxxxxx> [Thu, 2020-04-30 16:32 -0700]:
> > > Currently, bpf_getsockopt and bpf_setsockopt helpers operate on the
> > > 'struct bpf_sock_ops' context in BPF_PROG_TYPE_SOCK_OPS program.
> > > Let's generalize them and make them available for 'struct bpf_sock_addr'.
> > > That way, in the future, we can allow those helpers in more places.
> > >
> > > As an example, let's expose those 'struct bpf_sock_addr' based helpers to
> > > BPF_CGROUP_INET{4,6}_CONNECT hooks. That way we can override CC before the
> > > connection is made.
> > >
> > > v3:
> > > * Expose custom helpers for bpf_sock_addr context instead of doing
> > >   generic bpf_sock argument (as suggested by Daniel). Even with
> > >   try_socket_lock that doesn't sleep we have a problem where context sk
> > >   is already locked and socket lock is non-nestable.
> > >
> > > v2:
> > > * s/BPF_PROG_TYPE_CGROUP_SOCKOPT/BPF_PROG_TYPE_SOCK_OPS/
> > >
> > > Cc: John Fastabend <john.fastabend@xxxxxxxxx>
> > > Cc: Martin KaFai Lau <kafai@xxxxxx>
> > > Signed-off-by: Stanislav Fomichev <sdf@xxxxxxxxxx>
> >
> > ...
> >
> > >  SEC("cgroup/connect4")
> > >  int connect_v4_prog(struct bpf_sock_addr *ctx)
> > >  {
> > > @@ -66,6 +108,10 @@ int connect_v4_prog(struct bpf_sock_addr *ctx)
> > >
> > >       bpf_sk_release(sk);
> > >
> > > +     /* Rewrite congestion control. */
> > > +     if (ctx->type == SOCK_STREAM && set_cc(ctx))
> > > +             return 0;
> >
> > Hi Stas,
> >
> > This new check breaks one of tests in test_sock_addr:
> >
> >         root@arch-fb-vm1:/home/rdna/bpf-next/tools/testing/selftests/bpf ./test_sock_addr.sh
> >         ...
> >         (test_sock_addr.c:1199: errno: Operation not permitted) Fail to connect to server
> >         Test case: connect4: rewrite IP & TCP port .. [FAIL]
> >         ...
> >         Summary: 34 PASSED, 1 FAILED
> >
> > What the test does is it sets up TCPv4 server:
> >
> >         [pid   386] socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 6
> >         [pid   386] bind(6, {sa_family=AF_INET, sin_port=htons(4444), sin_addr=inet_addr("127.0.0.1")}, 128) = 0
> >         [pid   386] listen(6, 128)              = 0
> >
> > Then tries to connect to a fake IPv4:port and this connect4 program
> > should redirect it to that TCP server, but only if every field in
> > context has expected value.
> >
> > But after that commit program started denying the connect:
> >
> >         [pid   386] socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 7
> >         [pid   386] connect(7, {sa_family=AF_INET, sin_port=htons(4040), sin_addr=inet_addr("192.168.1.254")}, 128) = -1 EPERM (Operation not permitted)
> >         (test_sock_addr.c:1201: errno: Operation not permitted) Fail to connect to server
> >         Test case: connect4: rewrite IP & TCP port .. [FAIL]
> >
> > I verified that commenting out this new `if` fixes the problem, but
> > haven't spent time root-causing it. Could you please look at it?
> Could you please confirm that you have CONFIG_TCP_CONG_DCTCP=y in your kernel
> config? (I've added it to tools/testing/selftests/bpf/config)
> The test is now flipping CC to dctcp and back to default cubic. It can
> fail if dctcp is not compiled in.

Right. Martin asked same question and indeed my testing VM didn't have
dctcp enabled. With dctcp enabled it works fine.

I'm totally fine to keep dctcp enabled in my config or start using
tools/testing/selftests/bpf/config (I've always used my own config).

Another options can be to switch from dctcp to more widely-used reno in
the program (I tested, this works as well), or even check
net/ipv4/tcp_available_congestion_control and use a pair of whatever cc
available there, but up to you / BPF mainteiners really, as I said I'm
personally fine to just enable dctcp.

Thanks.

-- 
Andrey Ignatov



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux