Re: [bpf PATCH 2/2] bpf, sockmap: fix incorrect fwd_alloc accounting

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Mar 24, 2021 at 7:46 PM John Fastabend <john.fastabend@xxxxxxxxx> wrote:
>
> Cong Wang wrote:
> > On Wed, Mar 24, 2021 at 2:00 PM John Fastabend <john.fastabend@xxxxxxxxx> wrote:
> > >
> > > Incorrect accounting fwd_alloc can result in a warning when the socket
> > > is torn down,
> > >
>
> [...]
>
> > > To resolve lets only account for sockets on the ingress queue that are
> > > still associated with the current socket. On the redirect case we will
> > > check memory limits per 6fa9201a89898, but will omit fwd_alloc accounting
> > > until skb is actually enqueued. When the skb is sent via skb_send_sock_locked
> > > or received with sk_psock_skb_ingress memory will be claimed on psock_other.
>                      ^^^^^^^^^^^^^^^^^^^^
> >
> > You mean sk_psock_skb_ingress(), right?
>
> Yes.

skb_send_sock_locked() actually allocates its own skb when sending, hence
it uses a different skb for memory accounting.

>
> [...]
>
> > > @@ -880,12 +876,13 @@ static void sk_psock_strp_read(struct strparser *strp, struct sk_buff *skb)
> > >                 kfree_skb(skb);
> > >                 goto out;
> > >         }
> > > -       skb_set_owner_r(skb, sk);
> > >         prog = READ_ONCE(psock->progs.skb_verdict);
> > >         if (likely(prog)) {
> > > +               skb->sk = psock->sk;
> >
> > Why is skb_orphan() not needed here?
>
> These come from strparser which do not have skb->sk set.

Hmm, but sk_psock_verdict_recv() passes a clone too, like
strparser, so either we need it for both, or not at all. Clones
do not have skb->sk, so I think you can remove the one in
sk_psock_verdict_recv() too.


>
> >
> > Nit: You can just use 'sk' here, so "skb->sk = sk".
>
> Sure that is a bit nicer, will respin with this.
>
> >
> >
> > >                 tcp_skb_bpf_redirect_clear(skb);
> > >                 ret = sk_psock_bpf_run(psock, prog, skb);
> > >                 ret = sk_psock_map_verd(ret, tcp_skb_bpf_redirect_fetch(skb));
> > > +               skb->sk = NULL;
> >
> > Why do you want to set it to NULL here?
>
> So we don't cause the stack to throw other errors later if we
> were to call skb_orphan for example. Various places in the skb
> helpers expect both skb->sk and skb->destructor to be set together
> and here we are just using it as a mechanism to feed the sk into
> the BPF program side. The above skb_set_owner_r for example
> would likely BUG().

Sounds reasonable.

Thanks.



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux