Re: [RFC bpf-next v2 7/9] bpf: when doing BPF_PROG_TEST_RUN for flow dissector use no-skb mode

Willem de Bruijn <willemdebruijn.kernel@xxxxxxxxx> · Wed, 20 Mar 2019 15:23:00 -0400

On Wed, Mar 20, 2019 at 3:19 PM Stanislav Fomichev <sdf@xxxxxxxxxxx> wrote:
>
> On 03/20, Willem de Bruijn wrote:
> > On Wed, Mar 20, 2019 at 3:02 PM Stanislav Fomichev <sdf@xxxxxxxxxxx> wrote:
> > >
> > > On 03/20, Willem de Bruijn wrote:
> > > > On Wed, Mar 20, 2019 at 12:57 PM Stanislav Fomichev <sdf@xxxxxxxxxxx> wrote:
> > > > >
> > > > > On 03/19, Willem de Bruijn wrote:
> > > > > > On Tue, Mar 19, 2019 at 6:21 PM Stanislav Fomichev <sdf@xxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > Now that we have __flow_bpf_dissect which works on raw data (by
> > > > > > > constructing temporary on-stack skb), use it when doing
> > > > > > > BPF_PROG_TEST_RUN for flow dissector.
> > > > > > >
> > > > > > > This should help us catch any possible bugs due to missing shinfo on
> > > > > > > the per-cpu skb.
> > > > > > >
> > > > > > > Note that existing __skb_flow_bpf_dissect swallows L2 headers and returns
> > > > > > > nhoff=0, we need to preserve the existing behavior.
> > > > > > >
> > > > > > > Signed-off-by: Stanislav Fomichev <sdf@xxxxxxxxxx>
> > > > > > > ---
> > > > > > >  net/bpf/test_run.c | 48 ++++++++++++++--------------------------------
> > > > > > >  1 file changed, 14 insertions(+), 34 deletions(-)
> > > > > > >
> > > > > >
> > > > > > > @@ -300,9 +277,13 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
> > > > > > >         preempt_disable();
> > > > > > >         time_start = ktime_get_ns();
> > > > > > >         for (i = 0; i < repeat; i++) {
> > > > > > > -               retval = bpf_flow_dissect_skb(prog, skb,
> > > > > > > -                                             &flow_keys_dissector,
> > > > > > > -                                             &flow_keys);
> > > > > > > +               retval = bpf_flow_dissect(prog, data, eth->h_proto, ETH_HLEN,
> > > > > > > +                                         size, &flow_keys_dissector,
> > > > > > > +                                         &flow_keys);
> > > > > > > +               if (flow_keys.nhoff >= ETH_HLEN)
> > > > > > > +                       flow_keys.nhoff -= ETH_HLEN;
> > > > > > > +               if (flow_keys.thoff >= ETH_HLEN)
> > > > > > > +                       flow_keys.thoff -= ETH_HLEN;
> > > > > >
> > > > > > why are these conditional?
> > > > > Hm, I didn't want these to be negative, because bpf flow program can set
> > > > > them to zero and clamp_flow_keys makes sure they are in a "sensible"
> > > > > range. For this particular case, I think we need to amend
> > > > > clamp_flow_keys to make sure that flow_keys.nhoff is in the range of
> > > > > initial_nhoff..hlen, not 0..hlen (and then we can drop these checks).
> > > >
> > > > So, previously eth_type_trans would call with data at the network
> > > > header. Now it is called with data at the link layer. How would
> > > > __skb_flow_bpf_dissect "swallows L2 headers and returns nhoff=0"? That
> > > s/__skb_flow_bpf_dissect/eth_type_trans/, I'll clarify that in the patch
> > > description.
> > >
> > > > sounds incorrect.
> > > Previously, for skb case, eth_type_trans would pull ETH_HLEN (L2) and
> > > after that we did skb_reset_network_header. So when later we initialized
> > > flow keys (flow_keys->nhoff = skb_network_offset(skb)), that would
> > > yield nhoff == 0.
> > >
> > > For example, see:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
> > >
> > > Now, we explicitly call bpf_flow_dissect with nhoff=ETH_HLEN and have to
> > > undo it, otherwise, it breaks those tests.
> > >
> > > We could do something like the following instead:
> > > retval = bpf_flow_dissect(prog, data + ETH_HLEN, eth->h_proto, 0,
> > >                           size, &flow_keys_dissector,
> > >                           &flow_keys);
> > >
> > > But I wanted to make sure nhoff != 0 works.
> >
> > Makes sense. Ensuring that nhoff lies within initial_nhoff..hlen
> > sounds correct to me. But this is a limitation of the test, so should
> > be in the test logic, not in the generic clamp code. Perhaps just fail
> > the test if returned nhoff < ETH_HLEN?
> I don't think it's only about the tests. BPF program can return
> nhoff/thoff out of range as well (if there was some bug in its logic,
> for example). We should not blindly trust whatever it returns, right?

Definitely. That's why we clamp. I'm not sure that we have to restrict
the minimum offset to initial nhoff, however.