On Wed, 2024-04-17 at 21:15 -0700, Maciej Żenczykowski wrote:
>
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
> On Wed, Apr 17, 2024 at 7:53 PM Lena Wang (王娜) <Lena.Wang@xxxxxxxxxxxx> wrote:
> >
> > On Wed, 2024-04-17 at 15:48 -0400, Willem de Bruijn wrote:
> > > Lena Wang (王娜) wrote:
> > > > On Tue, 2024-04-16 at 19:14 -0400, Willem de Bruijn wrote:
> > > > > > > > > Personally, I think bpf_skb_pull_data() should have automatically
> > > > > > > > > (ie. in kernel code) reduced how much it pulls so that it would pull
> > > > > > > > > headers only,
> > > > > > > >
> > > > > > > > That would be a helper that parses headers to discover header length.
> > > > > > >
> > > > > > > Does it actually need to? Presumably the bpf pull function could
> > > > > > > notice that it is
> > > > > > > a packet flagged as being of type X (UDP GSO FRAGLIST) and reduce the pull
> > > > > > > accordingly so that it doesn't pull anything from the non-linear
> > > > > > > fraglist portion???
> > > > > > >
> > > > > > > I know only the generic overview of what udp gso is, not any details, so I am
> > > > > > > assuming here that there's some sort of guarantee to how these packets
> > > > > > > are structured... But I imagine there must be or we wouldn't be hitting these
> > > > > > > issues deeper in the stack?
> > > > > >
> > > > > > Perhaps for a packet of this type we're already guaranteed the headers
> > > > > > are in the linear portion,
> > > > > > and the pull should simply be ignored?
> > > > > >
> > > > > > > > Parsing is better left to the BPF program.
> > > > >
> > > > > I do prefer adding sanity checks to the BPF helpers, over having to
> > > > > add them in the net hot path only to protect against dangerous BPF
> > > > > programs.
> > > > >
> > > > Is it OK to ignore or decrease the pull length for a UDP GRO fraglist
> > > > packet?
> > > > It could save the normal packet so that it is sent to the user correctly.
> > > >
> > > > In common/net/core/filter.c
> > > > static inline int __bpf_try_make_writable(struct sk_buff *skb,
> > > > unsigned int write_len)
> > > > {
> > > > +if (skb_is_gso(skb) && (skb_shinfo(skb)->gso_type &
> > > > +(SKB_GSO_UDP | SKB_GSO_UDP_L4))) {
> > >
> > > The issue is not with SKB_GSO_UDP_L4, but with SKB_GSO_FRAGLIST.
> > >
> > Currently in the kernel only UDP uses SKB_GSO_FRAGLIST to do GRO. In
> > udp_offload.c, udp4_gro_complete adds "SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4"
> > to gso_type. Checking these two flags here limits the packets to
> > "UDP + needs GSO + fraglist".
> >
> > We could remove the SKB_GSO_UDP_L4 check to cover more packets that may
> > arrive at skb_segment_list.
> >
> > > > +return 0;
> > >
> > > Failing for any pull is a bit excessive. And would kill a sane
> > > workaround of pulling only as many bytes as needed.
> > >
> > > > + or if (write_len > skb_headlen(skb))
> > > > +write_len = skb_headlen(skb);
> > >
> > > Truncating requests would be a surprising change of behavior
> > > for this function.
> > >
> > > Failing for a pull > skb_headlen is arguably reasonable, as
> > > the alternative is that we let it go through but have to drop
> > > the now malformed packets on segmentation.
> > >
> >
> > Is it OK as below?
> >
> > In common/net/core/filter.c
> > static inline int __bpf_try_make_writable(struct sk_buff *skb,
> > unsigned int write_len)
> > {
> > + if (skb_is_gso(skb) && (skb_shinfo(skb)->gso_type &
> > + SKB_GSO_FRAGLIST) && (write_len > skb_headlen(skb))) {
> > + return 0;
>
> please limit write_len to skb_headlen() instead of just returning 0
>

Hi Maze & Willem,

Maze's advice is:

In common/net/core/filter.c
static inline int __bpf_try_make_writable(struct sk_buff *skb,
unsigned int write_len)
{
+ if (skb_is_gso(skb) && (skb_shinfo(skb)->gso_type &
+ SKB_GSO_FRAGLIST) && (write_len > skb_headlen(skb))) {
+ write_len = skb_headlen(skb);
+ }
 return skb_ensure_writable(skb, write_len);
}

Willem's advice is that "Failing for a pull > skb_headlen is arguably
reasonable...", which I read as preferring to return 0:

+ if (skb_is_gso(skb) && (skb_shinfo(skb)->gso_type &
+ SKB_GSO_FRAGLIST) && (write_len > skb_headlen(skb))) {
+ return 0;
+ }

These two suggestions seem to conflict. However, I am not sure whether my
understanding is right, and I would appreciate your further guidance.

Thanks
Lena

> > + }
> > return skb_ensure_writable(skb, write_len);
> > }
> >
> > > > +}
> > > > return skb_ensure_writable(skb, write_len);
> > > > }
> > > >
> > > > >
> > > > > In this case, it would be detecting this GSO type and failing the
> > > > > operation if exceeding skb_headlen().
> > > > >
> > > > > > > > > and not packet content.
> > > > > > > > > (This is assuming the rest of the code isn't ready to deal
> > > > > > > > > with a longer pull,
> > > > > > > > > which I think is the case atm. Pulling too much, and then
> > > > > > > > > crashing or forcing
> > > > > > > > > the stack to drop packets because of them being malformed
> > > > > > > > > seems wrong...)
> > > > > > > > >
> > > > > > > > > In general it would be nice if there was a way to just say
> > > > > > > > > pull all headers...
> > > > > > > > > (or possibly all L2/L3/L4 headers)
> > > > > > > > > You in general need to pull stuff *before* you've even looked
> > > > > > > > > at the packet,
> > > > > > > > > so that you can look at the packet,
> > > > > > > > > so it's relatively hard/annoying to pull the correct length
> > > > > > > > > from bpf code itself.
> > > > > > > > >
> > > > > > > > > > > > BPF needs to modify a proper length to do pull data.
> > > > > > > > > > > > However kernel should also improve the flow to avoid
> > > > > > > > > > > > crash from a bpf function call.
> > > > > > > > > > > > As there is no split flow and app may not decode the
> > > > > > > > > > > > merged UDP packet,
> > > > > > > > > > > > we should drop the packet without fraglist in
> > > > > > > > > > > > skb_segment_list here.
> > > > > > > > > > > >
> > > > > > > > > > > > Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.")
> > > > > > > > > > > > Signed-off-by: Shiming Cheng <shiming.cheng@xxxxxxxxxxxx>
> > > > > > > > > > > > Signed-off-by: Lena Wang <lena.wang@xxxxxxxxxxxx>
> > > > > > > > > > > > ---
> > > > > > > > > > > > net/core/skbuff.c | 3 +++
> > > > > > > > > > > > 1 file changed, 3 insertions(+)
> > > > > > > > > > > >
> > > > > > > > > > > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > > > > > > > > > > > index b99127712e67..f68f2679b086 100644
> > > > > > > > > > > > --- a/net/core/skbuff.c
> > > > > > > > > > > > +++ b/net/core/skbuff.c
> > > > > > > > > > > > @@ -4504,6 +4504,9 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
> > > > > > > > > > > > if (err)
> > > > > > > > > > > > goto err_linearize;
> > > > > > > > > > > >
> > > > > > > > > > > > +if (!list_skb)
> > > > > > > > > > > > +goto err_linearize;
> > > > > > > > > > > > +
> > > > > > > >
> > > > > > > > This would catch the case where the entire data frag_list
> > > > > > > > is linearized, but not a pskb_may_pull that only pulls in part
> > > > > > > > of the list.
> > > > > > > >
> > > > > > > > Even with BPF being privileged, the kernel should not crash if BPF
> > > > > > > > pulls a FRAGLIST GSO skb.
> > > > > > > >
> > > > > > > > But the check needs to be refined a bit. For a UDP GSO packet, I
> > > > > > > > think gso_size is still valid, so if the head_skb length does not
> > > > > > > > match gso_size, it has been messed with and should be dropped.
> > > > > > > >
> > > > Is it OK as below? Is it OK to add a log to record the error, to make
> > > > the issue easier to check?
> > > >
> > > > In net/core/skbuff.c skb_segment_list
> > > > +unsigned int mss = skb_shinfo(head_skb)->gso_size;
> > > > +bool err_len = false;
> > > >
> > > > +if ( mss != GSO_BY_FRAGS && mss != skb_headlen(head_skb)) {
> > > > +pr_err("skb is dropped due to messed data. gso size:%d,
> > > > +hdrlen:%d", mss, skb_headlen(head_skb)
> > >
> > > Such logs should always be rate limited. But no need to log cases
> > > where we well understood how we get there.
> > >
> > > I would stick with one approach: either in the BPF func or in
> > > segmentation, not both. And then I find BPF preferable, as explained
> > > before.
> > >
> > OK, we will try to make a patch in the BPF func.
> >
> > > > +if (!list_skb)
> > > > +goto err_linearize;
> > > > +else
> > > > +err_len = true;
> > > > +}
> > > >
> > > > ...
> > > > +if (err_len) {
> > > > +goto err_linearize;
> > > > +}
> > > >
> > > > skb_get(skb);
> > > > ...
>
> --
> Maciej Żenczykowski, Kernel Networking Developer @ Google
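
To make the two options discussed in this thread easier to compare, here
is a minimal userspace sketch. It is not kernel code: struct mock_skb,
mock_ensure_writable(), make_writable_clamp() and make_writable_fail()
are hypothetical stand-ins invented for illustration. It also assumes
that "failing" means returning a negative errno (kernel helpers
conventionally use 0 for success); -EINVAL is only a placeholder, since
the thread does not name an error code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the sk_buff state relevant to this discussion. */
struct mock_skb {
	unsigned int headlen;   /* bytes in the linear area, like skb_headlen() */
	unsigned int len;       /* total bytes: linear area plus frag_list */
	bool gso_fraglist;      /* models gso_type & SKB_GSO_FRAGLIST on a GSO skb */
};

/* Rough model of skb_ensure_writable(): grow the linear area to write_len. */
static inline int mock_ensure_writable(struct mock_skb *skb,
				       unsigned int write_len)
{
	if (write_len > skb->len)
		return -ENOMEM;            /* cannot pull more than the skb holds */
	if (write_len > skb->headlen)
		skb->headlen = write_len;  /* data moves out of the frag_list */
	return 0;
}

/* Option A (clamp): silently limit the pull to the linear area. */
static inline int make_writable_clamp(struct mock_skb *skb,
				      unsigned int write_len)
{
	if (skb->gso_fraglist && write_len > skb->headlen)
		write_len = skb->headlen;
	return mock_ensure_writable(skb, write_len);
}

/* Option B (fail): reject a pull that would reach into the frag_list. */
static inline int make_writable_fail(struct mock_skb *skb,
				     unsigned int write_len)
{
	if (skb->gso_fraglist && write_len > skb->headlen)
		return -EINVAL;            /* assumed errno, not from the thread */
	return mock_ensure_writable(skb, write_len);
}
```

In this model both options leave the frag_list untouched for an
oversized pull. The difference is what the BPF program sees: with the
clamp it gets success and fewer linear bytes than it asked for, while
with the failing variant it gets an error it can react to.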