Re: [Patch bpf] bpf: check negative offsets in __bpf_skb_min_len()

Cong Wang <xiyou.wangcong@xxxxxxxxx> · Sat, 26 Oct 2024 08:33:31 -0700

On Tue, Oct 22, 2024 at 10:52:31PM +0200, Daniel Borkmann wrote:
> On 10/8/24 7:33 AM, Cong Wang wrote:
> > From: Cong Wang <cong.wang@xxxxxxxxxxxxx>
> > 
> > skb_transport_offset() and skb_transport_offset() can be negative when
> 
> nit: I presume the 2nd one is skb_network_offset?
> 
> > they are called after we pull the transport header, for example, when
> > we use eBPF sockmap (aka at the point of ->sk_data_ready()).
> > 
> > __bpf_skb_min_len() uses an unsigned int to get these offsets, this
> > leads to a very large number which causes bpf_skb_change_tail() failed
> > unexpectedly.
> > 
> > Fix this by using a signed int to get these offsets and test them
> > against zero.
> > 
> > Fixes: 5293efe62df8 ("bpf: add bpf_skb_change_tail helper")
> > Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> > Signed-off-by: Cong Wang <cong.wang@xxxxxxxxxxxxx>
> 
> Is there any chance you could also extend the sockmap BPF selftest with
> this case you're hitting so that BPF CI can run this regularly?

Yes, my colleague Zijian (Cc'ed) is working on a selftest to cover this case.

Please let me know if you prefer to send it together with the selftest,
technically it would make backporting this fix harder, but I am open to
any suggestion here.

> 
> > ---
> >   net/core/filter.c | 21 +++++++++++++++------
> >   1 file changed, 15 insertions(+), 6 deletions(-)
> > 
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index 4e3f42cc6611..10ef27639a5d 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -3737,13 +3737,22 @@ static const struct bpf_func_proto bpf_skb_adjust_room_proto = {
> >   static u32 __bpf_skb_min_len(const struct sk_buff *skb)
> >   {
> > -	u32 min_len = skb_network_offset(skb);
> > +	int offset = skb_network_offset(skb);
> > +	u32 min_len = 0;
> > -	if (skb_transport_header_was_set(skb))
> > -		min_len = skb_transport_offset(skb);
> > -	if (skb->ip_summed == CHECKSUM_PARTIAL)
> > -		min_len = skb_checksum_start_offset(skb) +
> > -			  skb->csum_offset + sizeof(__sum16);
> > +	if (offset > 0)
> > +		min_len = offset;
> > +	if (skb_transport_header_was_set(skb)) {
> > +		offset = skb_transport_offset(skb);
> > +		if (offset > 0)
> > +			min_len = offset;
> > +	}
> > +	if (skb->ip_summed == CHECKSUM_PARTIAL) {
> > +		offset = skb_checksum_start_offset(skb) +
> > +			 skb->csum_offset + sizeof(__sum16);
> > +		if (offset > 0)
> > +			min_len = offset;
> > +	}
> >   	return min_len;
> 
> I'll let John chime in, but does this mean in case of sockmap min_len always ends
> up at 0? I just wonder whether we should pass a custom __bpf_skb_min_len to
> __bpf_skb_change_tail for bpf_skb_change_tail vs sk_skb_change_tail assuming the
> compiler is able to inlining all this (instead of indirect call).

Yes, in case of sockmap skb->data is already past TCP header, so all the
offsets here are negative. And since the 'new_len' of bpf_skb_change_tail()
is unsigned (too late to change), min_len should be zero.

Thanks.