On Wed, Mar 20, 2019 at 3:48 PM Stanislav Fomichev <sdf@xxxxxxxxxxx> wrote:
>
> On 03/20, Willem de Bruijn wrote:
> > On Wed, Mar 20, 2019 at 3:19 PM Stanislav Fomichev <sdf@xxxxxxxxxxx> wrote:
> > >
> > > On 03/20, Willem de Bruijn wrote:
> > > > On Wed, Mar 20, 2019 at 3:02 PM Stanislav Fomichev <sdf@xxxxxxxxxxx> wrote:
> > > > >
> > > > > On 03/20, Willem de Bruijn wrote:
> > > > > > On Wed, Mar 20, 2019 at 12:57 PM Stanislav Fomichev <sdf@xxxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > On 03/19, Willem de Bruijn wrote:
> > > > > > > > On Tue, Mar 19, 2019 at 6:21 PM Stanislav Fomichev <sdf@xxxxxxxxxx> wrote:
> > > > > > > > >
> > > > > > > > > Now that we have __flow_bpf_dissect which works on raw data (by
> > > > > > > > > constructing temporary on-stack skb), use it when doing
> > > > > > > > > BPF_PROG_TEST_RUN for flow dissector.
> > > > > > > > >
> > > > > > > > > This should help us catch any possible bugs due to missing shinfo on
> > > > > > > > > the per-cpu skb.
> > > > > > > > >
> > > > > > > > > Note that existing __skb_flow_bpf_dissect swallows L2 headers and returns
> > > > > > > > > nhoff=0, we need to preserve the existing behavior.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Stanislav Fomichev <sdf@xxxxxxxxxx>
> > > > > > > > > ---
> > > > > > > > >  net/bpf/test_run.c | 48 ++++++++++++++----------------------------------
> > > > > > > > >  1 file changed, 14 insertions(+), 34 deletions(-)
> > > > > > > > >
> > > > > > > >
> > > > > > > > > @@ -300,9 +277,13 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
> > > > > > > > >         preempt_disable();
> > > > > > > > >         time_start = ktime_get_ns();
> > > > > > > > >         for (i = 0; i < repeat; i++) {
> > > > > > > > > -               retval = bpf_flow_dissect_skb(prog, skb,
> > > > > > > > > -                                             &flow_keys_dissector,
> > > > > > > > > -                                             &flow_keys);
> > > > > > > > > +               retval = bpf_flow_dissect(prog, data, eth->h_proto, ETH_HLEN,
> > > > > > > > > +                                         size, &flow_keys_dissector,
> > > > > > > > > +                                         &flow_keys);
> > > > > > > > > +               if (flow_keys.nhoff >= ETH_HLEN)
> > > > > > > > > +                       flow_keys.nhoff -= ETH_HLEN;
> > > > > > > > > +               if (flow_keys.thoff >= ETH_HLEN)
> > > > > > > > > +                       flow_keys.thoff -= ETH_HLEN;
> > > > > > > >
> > > > > > > > why are these conditional?
> > > > > > > Hm, I didn't want these to be negative, because bpf flow program can set
> > > > > > > them to zero and clamp_flow_keys makes sure they are in a "sensible"
> > > > > > > range. For this particular case, I think we need to amend
> > > > > > > clamp_flow_keys to make sure that flow_keys.nhoff is in the range of
> > > > > > > initial_nhoff..hlen, not 0..hlen (and then we can drop these checks).
> > > > > >
> > > > > > So, previously eth_type_trans would call with data at the network
> > > > > > header. Now it is called with data at the link layer. How would
> > > > > > __skb_flow_bpf_dissect "swallows L2 headers and returns nhoff=0"? That
> > > > > s/__skb_flow_bpf_dissect/eth_type_trans/, I'll clarify that in the patch
> > > > > description.
> > > > >
> > > > > > sounds incorrect.
> > > > > Previously, for skb case, eth_type_trans would pull ETH_HLEN (L2) and
> > > > > after that we did skb_reset_network_header. So when later we initialized
> > > > > flow keys (flow_keys->nhoff = skb_network_offset(skb)), that would
> > > > > yield nhoff == 0.
> > > > >
> > > > > For example, see:
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
> > > > >
> > > > > Now, we explicitly call bpf_flow_dissect with nhoff=ETH_HLEN and have to
> > > > > undo it, otherwise, it breaks those tests.
> > > > >
> > > > > We could do something like the following instead:
> > > > > retval = bpf_flow_dissect(prog, data + ETH_HLEN, eth->h_proto, 0,
> > > > >                           size, &flow_keys_dissector,
> > > > >                           &flow_keys);
> > > > >
> > > > > But I wanted to make sure nhoff != 0 works.
> > > >
> > > > Makes sense. Ensuring that nhoff lies within initial_nhoff..hlen
> > > > sounds correct to me. But this is a limitation of the test, so should
> > > > be in the test logic, not in the generic clamp code. Perhaps just fail
> > > > the test if returned nhoff < ETH_HLEN?
> > > I don't think it's only about the tests. BPF program can return
> > > nhoff/thoff out of range as well (if there was some bug in its logic,
> > > for example). We should not blindly trust whatever it returns, right?
> >
> > Definitely. That's why we clamp. I'm not sure that we have to restrict
> > the minimum offset to initial nhoff, however.
> Makes sense. TBH, only the tests currently care about nhoff that flow
> dissector returns. In the kernel we use only thoff from bpf flow dissector
> and ignore any modifications to the nhoff.
>
> Do you think there is a usecase for nhoff possibly going backwards?
> In other words, why not prohibit that from the beginning and set the
> expectations straight (i.e. nhoff only grows).

Fair point. That is how the non bpf flow dissector works.
And if the initial offset is always sensible, indeed I see no
reasonable case where the program would return a lower value.

I was a bit concerned about that precondition. In practice, all but
one caller passes 0 and data at the network header, where this
discussion is moot. And the exception is eth_get_headlen, which
hardcodes ETH_HLEN.

Given that, your original suggestion to adjust the clamp function SGTM.
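
To make the agreed direction concrete, here is a rough user-space sketch
of a clamp that keeps nhoff within initial_nhoff..hlen and thoff within
nhoff..hlen, plus the unconditional ETH_HLEN adjustment the test-run path
could then use. This is not the kernel code: struct flow_offsets,
clamp_u16, clamp_flow_offsets and strip_l2_offset are hypothetical names
made up for illustration, and the sketch assumes the caller guarantees
initial_nhoff <= hlen.

```c
#include <stdint.h>

#define ETH_HLEN 14 /* Ethernet header length in bytes */

/* Hypothetical stand-in for the two offsets kept in the flow keys. */
struct flow_offsets {
	uint16_t nhoff; /* network header offset */
	uint16_t thoff; /* transport header offset */
};

/* Restrict v to the inclusive range lo..hi (assumes lo <= hi). */
static uint16_t clamp_u16(uint16_t v, uint16_t lo, uint16_t hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Clamp program-supplied offsets so that
 *   initial_nhoff <= nhoff <= thoff <= hlen.
 * Assumption: the caller guarantees initial_nhoff <= hlen.
 */
static void clamp_flow_offsets(struct flow_offsets *keys,
			       uint16_t initial_nhoff, uint16_t hlen)
{
	keys->nhoff = clamp_u16(keys->nhoff, initial_nhoff, hlen);
	keys->thoff = clamp_u16(keys->thoff, keys->nhoff, hlen);
}

/*
 * Test-run side: report offsets relative to the network header.
 * After the clamp above with initial_nhoff == ETH_HLEN, both offsets
 * are >= ETH_HLEN, so the subtraction cannot wrap and needs no check.
 */
static void strip_l2_offset(struct flow_offsets *keys)
{
	keys->nhoff -= ETH_HLEN;
	keys->thoff -= ETH_HLEN;
}
```

With the lower bound enforced in the clamp, a program that zeroes both
offsets ends up at nhoff == thoff == ETH_HLEN, and the test-run
adjustment yields 0 for both without the conditionals from the patch.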