Toshiaki Makita <toshiaki.makita1@xxxxxxxxx> writes: > On 2019/11/22 20:54, Toke Høiland-Jørgensen wrote: >> Toshiaki Makita <toshiaki.makita1@xxxxxxxxx> writes: >> >>> On 2019/11/18 19:20, Toke Høiland-Jørgensen wrote: >>>> Toshiaki Makita <toshiaki.makita1@xxxxxxxxx> writes: >>>> >>>> [... trimming the context a bit ...] >>>> >>>>>>>> Take your example of TC rules: You were proposing a flow like this: >>>>>>>> >>>>>>>> Userspace TC rule -> kernel rule table -> eBPF map -> generated XDP >>>>>>>> program >>>>>>>> >>>>>>>> Whereas what I mean is that we could do this instead: >>>>>>>> >>>>>>>> Userspace TC rule -> kernel rule table >>>>>>>> >>>>>>>> and separately >>>>>>>> >>>>>>>> XDP program -> bpf helper -> lookup in kernel rule table >>>>>>> >>>>>>> Thanks, now I see what you mean. >>>>>>> You expect an XDP program like this, right? >>>>>>> >>>>>>> int xdp_tc(struct xdp_md *ctx) >>>>>>> { >>>>>>> int act = bpf_xdp_tc_filter(ctx); >>>>>>> return act; >>>>>>> } >>>>>> >>>>>> Yes, basically, except that the XDP program would need to parse the >>>>>> packet first, and bpf_xdp_tc_filter() would take a parameter struct with >>>>>> the parsed values. See the usage of bpf_fib_lookup() in >>>>>> bpf/samples/xdp_fwd_kern.c >>>>>> >>>>>>> But doesn't this way lose a chance to reduce/minimize the program to >>>>>>> only use necessary features for this device? >>>>>> >>>>>> Not necessarily. Since the BPF program does the packet parsing and fills >>>>>> in the TC filter lookup data structure, it can limit what features are >>>>>> used that way (e.g., if I only want to do IPv6, I just parse the v6 >>>>>> header, ignore TCP/UDP, and drop everything that's not IPv6). The lookup >>>>>> helper could also have a flag argument to disable some of the lookup >>>>>> features. >>>>> >>>>> It's unclear to me how to configure that. >>>>> Use options when attaching the program? Something like >>>>> $ xdp_tc attach eth0 --only-with ipv6 >>>>> But can users always determine their necessary features in advance? >>>> >>>> That's what I'm doing with xdp-filter now. But the answer to your second >>>> question is likely to be 'probably not', so it would be good to not have >>>> to do this :) >>>> >>>>> Frequent manual reconfiguration when TC rules frequently changes does >>>>> not sound nice. Or, add hook to kernel to listen any TC filter event >>>>> on some daemon and automatically reload the attached program? >>>> >>>> Doesn't have to be a kernel hook; we could enhance the userspace tooling >>>> to do it. Say we integrate it into 'tc': >>>> >>>> - Add a new command 'tc xdp_accel enable <iface> --features [ipv6,etc]' >>>> - When adding new rules, add the following logic: >>>> - Check if XDP acceleration is enabled >>>> - If it is, check whether the rule being added fits into the current >>>> 'feature set' loaded on that interface. >>>> - If the rule needs more features, reload the XDP program to one >>>> with the needed additional features. >>>> - Or, alternatively, just warn the user and let them manually >>>> replace it? >>> >>> Ok, but there are other userspace tools to configure tc in wild. >>> python and golang have their own netlink library project. >>> OVS embeds TC netlink handling code in itself. There may be more tools like this. >>> I think at least we should have rtnl notification about TC and monitor it >>> from daemon, if we want to reload the program from userspace tools. >> >> A daemon would be one way to do this in cases where it needs to be >> completely dynamic. My guess is that there are lots of environments >> where that is not required, and where a user/administrator could >> realistically specify ahead of time which feature set they want to >> enable XDP acceleration for. So in my mind the way to go about this is >> to implement the latter first, then add dynamic reconfiguration of it on >> top when (or if) it turns out to be necessary... > > Hmm, but I think there is big difference between a daemon and a cli tool. > Shouldn't we determine the design considering future usage? Sure, we should make sure the design doesn't exclude either option. But we also shouldn't end up in a "the perfect is the enemy of the good" type of situation. And the kernel-side changes are likely to be somewhat independent of what the userspace management ends up looking like... -Toke