Re: [PATCH net] udp: fix segmentation crash for GRO packet without fraglist

Lena Wang (王娜) <Lena.Wang@xxxxxxxxxxxx> · Tue, 23 Apr 2024 14:47:53 +0000

On Fri, 2024-04-19 at 13:41 -0400, Willem de Bruijn wrote:
>  	 
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
>  Maciej Żenczykowski wrote:
> > On Fri, Apr 19, 2024 at 7:17 AM Willem de Bruijn
> > <willemdebruijn.kernel@xxxxxxxxx> wrote:
> > >
> > > Lena Wang (王娜) wrote:
> > > > On Wed, 2024-04-17 at 21:15 -0700, Maciej Żenczykowski wrote:
> > > > >
> > > > > External email : Please do not click links or open
> attachments until
> > > > > you have verified the sender or the content.
> > > > >  On Wed, Apr 17, 2024 at 7:53 PM Lena Wang (王娜) <
> > > > > Lena.Wang@xxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Wed, 2024-04-17 at 15:48 -0400, Willem de Bruijn wrote:
> > > > > > >
> > > > > > > External email : Please do not click links or open
> attachments
> > > > > until
> > > > > > > you have verified the sender or the content.
> > > > > > >  Lena Wang (王娜) wrote:
> > > > > > > > On Tue, 2024-04-16 at 19:14 -0400, Willem de Bruijn
> wrote:
> > > > > > > > >
> > > > > > > > > External email : Please do not click links or open
> > > > > attachments
> > > > > > > until
> > > > > > > > > you have verified the sender or the content.
> > > > > > > > >  > > > > Personally, I think bpf_skb_pull_data()
> should have
> > > > > > > > > automatically
> > > > > > > > > > > > > (ie. in kernel code) reduced how much it
> pulls so
> > > > > that it
> > > > > > > > > would pull
> > > > > > > > > > > > > headers only,
> > > > > > > > > > > >
> > > > > > > > > > > > That would be a helper that parses headers to
> discover
> > > > > > > header
> > > > > > > > > length.
> > > > > > > > > > >
> > > > > > > > > > > Does it actually need to?  Presumably the bpf
> pull
> > > > > function
> > > > > > > could
> > > > > > > > > > > notice that it is
> > > > > > > > > > > a packet flagged as being of type X (UDP GSO
> FRAGLIST)
> > > > > and
> > > > > > > reduce
> > > > > > > > > the pull
> > > > > > > > > > > accordingly so that it doesn't pull anything from
> the
> > > > > non-
> > > > > > > linear
> > > > > > > > > > > fraglist portion???
> > > > > > > > > > >
> > > > > > > > > > > I know only the generic overview of what udp gso
> is, not
> > > > > any
> > > > > > > > > details, so I am
> > > > > > > > > > > assuming here that there's some sort of guarantee
> to how
> > > > > > > these
> > > > > > > > > packets
> > > > > > > > > > > are structured...  But I imagine there must be or
> we
> > > > > wouldn't
> > > > > > > be
> > > > > > > > > hitting these
> > > > > > > > > > > issues deeper in the stack?
> > > > > > > > > >
> > > > > > > > > > Perhaps for a packet of this type we're already
> guaranteed
> > > > > the
> > > > > > > > > headers
> > > > > > > > > > are in the linear portion,
> > > > > > > > > > and the pull should simply be ignored?
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > Parsing is better left to the BPF program.
> > > > > > > > >
> > > > > > > > > I do prefer adding sanity checks to the BPF helpers,
> over
> > > > > having
> > > > > > > to
> > > > > > > > > add then in the net hot path only to protect against
> > > > > dangerous
> > > > > > > BPF
> > > > > > > > > programs.
> > > > > > > > >
> > > > > > > > Is it OK to ignore or decrease pull length for udp gro
> fraglist
> > > > > > > packet?
> > > > > > > > It could save the normal packet and sent to user
> correctly.
> > > > > > > >
> > > > > > > > In common/net/core/filter.c
> > > > > > > > static inline int __bpf_try_make_writable(struct
> sk_buff *skb,
> > > > > > > >               unsigned int write_len)
> > > > > > > > {
> > > > > > > > +if (skb_is_gso(skb) && (skb_shinfo(skb)->gso_type &
> > > > > > > > +(SKB_GSO_UDP  |SKB_GSO_UDP_L4)) {
> > > > > > >
> > > > > > > The issue is not with SKB_GSO_UDP_L4, but with
> SKB_GSO_FRAGLIST.
> > > > > > >
> > > > > > Current in kernel just UDP uses SKB_GSO_FRAGLIST to do GRO.
> In
> > > > > > udp_offload.c udp4_gro_complete gso_type adds
> "SKB_GSO_FRAGLIST|
> > > > > > SKB_GSO_UDP_L4". Here checking these two flags is to limit
> the
> > > > > packet
> > > > > > as "UDP + need GSO + fraglist".
> > > > > >
> > > > > > We could remove SKB_GSO_UDP_L4 check for more packet that
> may
> > > > > addrive
> > > > > > skb_segment_list.
> > > > > >
> > > > > > > > +return 0;
> > > > > > >
> > > > > > > Failing for any pull is a bit excessive. And would kill a
> sane
> > > > > > > workaround of pulling only as many bytes as needed.
> > > > > > >
> > > > > > > > +     or if (write_len > skb_headlen(skb))
> > > > > > > > +write_len = skb_headlen(skb);
> > > > > > >
> > > > > > > Truncating requests would be a surprising change of
> behavior
> > > > > > > for this function.
> > > > > > >
> > > > > > > Failing for a pull > skb_headlen is arguably reasonable,
> as
> > > > > > > the alternative is that we let it go through but have to
> drop
> > > > > > > the now malformed packets on segmentation.
> > > > > > >
> > > > > > >
> > > > > > Is it OK as below?
> > > > > >
> > > > > > In common/net/core/filter.c
> > > > > > static inline int __bpf_try_make_writable(struct sk_buff
> *skb,
> > > > > >               unsigned int write_len)
> > > > > > {
> > > > > > +       if (skb_is_gso(skb) && (skb_shinfo(skb)->gso_type &
> > > > > > +               SKB_GSO_FRAGLIST) && (write_len >
> > > > > skb_headlen(skb))) {
> > > > > > +               return 0;
> > > > >
> > > > > please limit write_len to skb_headlen() instead of just
> returning 0
> > > > >
> > > >
> > > > Hi Maze & Willem,
> > > > Maze's advice is:
> > > > In common/net/core/filter.c
> > > > static inline int __bpf_try_make_writable(struct sk_buff *skb,
> > > >               unsigned int write_len)
> > > > {
> > > > +       if (skb_is_gso(skb) && (skb_shinfo(skb)->gso_type &
> > > > +               SKB_GSO_FRAGLIST) && (write_len >
> skb_headlen(skb))) {
> > > > +               write_len = skb_headlen(skb);
> > > > +       }
> > > >         return skb_ensure_writable(skb, write_len);
> > > > }
> > > >
> > > > Willem's advice is to "Failing for a pull > skb_headlen is
> arguably
> > > > reasonable...". It prefers to return 0 :
> > > > +       if (skb_is_gso(skb) && (skb_shinfo(skb)->gso_type &
> > > > +               SKB_GSO_FRAGLIST) && (write_len >
> skb_headlen(skb))) {
> > > > +               return 0;
> > > > +       }
> > > >
> > > > It seems a bit conflict. However I am not sure if my
> understanding is
> > > > right and hope to get your further guide.
> > >
> > > I did not mean to return 0. But to fail a request that would pull
> an
> > > unsafe amount. The caller must get a clear error signal.
> > 
> > That's hostile on userspace.
> > Currently the caller doesn't even check the error return...
> 
> It can, and probably should.
> 
> bpf_skb_pull data returns the error code from bpf_try_make_writable:
> 
>    return bpf_try_make_writable(skb, len ? : skb_headlen(skb));
> 
> > Why would we?  We already have to reload all pointers, and have to
> do
> > and will thus redo checking on those.
> > 
> > What do you expect the caller to do? Subtract -1 and try again?
> > That's hard to do from BPF as it involves looping... and is slow.
> > 
> > We already try to not pull too much:
> > 
> > void try_make_writable(struct __sk_buff* skb, int len) {
> >   if (len > skb->len) len = skb->len;
> >   if (skb->data_end - skb->data < len) bpf_skb_pull_data(skb, len);
> > }
> > 
> > Is there at least something like skb->len that has the actually
> > pullable length in it?
> 
> The above snippet shows that it passes skb_headlen if the caller
> passes 0.
> 
> But your BPF program does not even need the data writable, so then
> it is of little help of course.
>  
> > Or are these skb's structured in such a way that there is never a
> need
> > to pull anything,
> > because the headers are already always in the linear portion?
> 
> That is indeed the case.
> 
> So as far as I can see:
> 
> A BPF program that just wants to pull the network and transport
> headers can diligently pull exactly what is needed. And will not
> even observe any data pulled into linear in practice. This is still
> advisable rather than trusting that the headers are linear. It may
> also be required by the validator? Don't know. But check the return
> value.
> 
Hi Willem,
As the discussion, is it OK for the patch below?

diff --git a/net/core/filter.c b/net/core/filter.c
index 3a6110ea4009..abc6029c8eef 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -1655,6 +1655,11 @@ static DEFINE_PER_CPU(struct bpf_scratchpad,
bpf_sp);
 static inline int __bpf_try_make_writable(struct sk_buff *skb,
                                          unsigned int write_len)
 {
+       if (skb_is_gso(skb) && (skb_shinfo(skb)->gso_type &
+                       SKB_GSO_FRAGLIST) && (write_len >
skb_headlen(skb))) {
+               return -ENOMEM;
+       }
+
        return skb_ensure_writable(skb, write_len);
 }

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 73b1e0e53534..2e90534c1a1e 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4036,9 +4036,11 @@ struct sk_buff *skb_segment_list(struct sk_buff
*skb,
        unsigned int tnl_hlen = skb_tnl_header_len(skb);
        unsigned int delta_truesize = 0;
        unsigned int delta_len = 0;
+       unsigned int mss = skb_shinfo(skb)->gso_size;
        struct sk_buff *tail = NULL;
        struct sk_buff *nskb, *tmp;
        int len_diff, err;
+       bool err_len = false;

        skb_push(skb, -skb_network_offset(skb) + offset);

@@ -4047,6 +4049,14 @@ struct sk_buff *skb_segment_list(struct sk_buff
*skb,
        if (err)
                goto err_linearize;

+       if (mss != GSO_BY_FRAGS && mss != skb_headlen(skb)) {
+               if (!list_skb) {
+                       goto err_linearize;
+               } else {
+                       err_len = true;
+               }
+       }
+
        skb_shinfo(skb)->frag_list = NULL;

        while (list_skb) {
@@ -4109,6 +4119,9 @@ struct sk_buff *skb_segment_list(struct sk_buff
*skb,
            __skb_linearize(skb))
                goto err_linearize;

+       if (err_len)
+               goto err_linearize;
+
        skb_get(skb);

        return skb;

> > 
> > > Back to the original report: the issue should already have been
> fixed
> > > by commit 876e8ca83667 ("net: fix NULL pointer in
> skb_segment_list").
> > > But that commit is in the kernel for which you report the error.
> > >
> > > Turns out that the crash is not in skb_segment_list, but later in
> > > __udpv4_gso_segment_list_csum. Which unconditionally dereferences
> > > udp_hdr(seg).
> > >
> > > The above fix also mentions skb pull as the culprit, but does not
> > > include a BPF program. If this can be reached in other ways, then
> we
> > > do need a stronger test in skb_segment_list, as you propose.
> > >
> > > I don't want to narrowly check whether udp_hdr is safe.
> Essentially,
> > > an SKB_GSO_FRAGLIST skb layout cannot be trusted at all if even
> one
> > > byte would get pulled.
> > 
> > --
> > Maciej Żenczykowski, Kernel Networking Developer @ Google
> 
>