Re: [PATCH bpf-next 2/2] bpf: Require only one of cong_avoid() and cong_control() from a TCP CC

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2022-06-09 at 11:55 -0700, Martin KaFai Lau wrote:
> On Thu, Jun 09, 2022 at 10:55:25AM +0200, Jörn-Thorben Hinz wrote:
> > Thanks for the feedback, Martin.
> > 
> > On Wed, 2022-06-08 at 11:33 -0700, Martin KaFai Lau wrote:
> > > On Wed, Jun 08, 2022 at 07:48:43PM +0200, Jörn-Thorben Hinz
> > > wrote:
> > > > When a CC implements tcp_congestion_ops.cong_control(), the
> > > > alternate
> > > > cong_avoid() is not in use in the TCP stack. Do not force a BPF
> > > > CC
> > > > to
> > > > implement cong_avoid() as a no-op by always requiring it.
> > > > 
> > > > An incomplete BPF CC implementing neither cong_avoid() nor
> > > > cong_control() will still get rejected by
> > > > tcp_register_congestion_control().
> > > > 
> > > > Signed-off-by: Jörn-Thorben Hinz <jthinz@xxxxxxxxxxxxxxxxxxxx>
> > > > ---
> > > >  net/ipv4/bpf_tcp_ca.c | 1 +
> > > >  1 file changed, 1 insertion(+)
> > > > 
> > > > diff --git a/net/ipv4/bpf_tcp_ca.c b/net/ipv4/bpf_tcp_ca.c
> > > > index 1f5c53ede4e5..37290d0bf134 100644
> > > > --- a/net/ipv4/bpf_tcp_ca.c
> > > > +++ b/net/ipv4/bpf_tcp_ca.c
> > > > @@ -17,6 +17,7 @@ extern struct bpf_struct_ops
> > > > bpf_tcp_congestion_ops;
> > > >  static u32 optional_ops[] = {
> > > >         offsetof(struct tcp_congestion_ops, init),
> > > >         offsetof(struct tcp_congestion_ops, release),
> > > > +       offsetof(struct tcp_congestion_ops, cong_avoid),
> > > At least one of the cong_avoid() or cong_control() is needed.
> > > It is better to remove is_optional(moff) check and its
> > > optional_ops[]
> > > here.  Only depends on the tcp_register_congestion_control()
> > > which
> > > does a similar check at the beginning.
> > You mean completely remove this part of the validation from
> > bpf_tcp_ca.c and just rely on tcp_register_congestion_control()?
> > True,
> Yes.
> 
> > that would be even easier to maintain at this point, make
> > tcp_register_congestion_control() the one-and-only place that has
> > to
> > know about required and optional functions.
> > 
> > Will rework the second patch.
> > 
> > > 
> > > Patch 1 looks good.  tcp_bbr.c also needs the sk_pacing fields.
> > > 
> > > A selftest is needed.  Can you share your bpf tcp-cc and
> > > use it as a selftest to exercise the change in this patch
> > > set ?
> > I cannot do that just now, unfortunately. It’s still earlier work
> > in
> > progress. Also, it will have an additional, external dependency
> > which
> > might make it unfit to be included here/as a selftest. I will keep
> > it
> > in mind for later this year, though.
> What is the external dependency ?  Could you share some high level
> of the CC you are developing ?
> The reason for this question is to see if there is something
> missing from the kernel side to write the tcp-cc in bpf that you
> are developing.
The algorithm is PowerTCP[1], I’m currently implementing it with eBPF.
It requires telemetry from the network.

The mentioned dependency (not developed by us, not yet released) will
provide such telemetry. Its end-host part is implemented with eBPF
itself and does not require any additional kernel patches.

At the moment I’m not aware of any other bits missing from the kernel
side. Will propose another patch if that changes or at least report it.

[1] https://www.usenix.org/system/files/nsdi22-paper-addanki_3.pdf

> 
> > In the meantime, I could look into adding a more naive/trivial
> > test,
> > that implements cong_control() without cong_avoid() and relies on
> > sk_pacing_* being writable, if you would prefer that? Would that be
> > fine as a follow-up patch (might take me a moment) or better be
> > included in this series?
> Yeah, it will do and the test should be submitted together in
> this series.
Please see v3 of the series.




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux