On 19/08/17 (土) 3:52:24, Jakub Kicinski wrote:
On Fri, 16 Aug 2019 10:28:10 +0900, Toshiaki Makita wrote:
On 2019/08/16 4:22, Jakub Kicinski wrote:
There's a certain allure in bringing the in-kernel BPF translation
infrastructure forward. OTOH from system architecture perspective IMHO
it does seem like a task best handed in user space. bpfilter can replace
iptables completely, here we're looking at an acceleration relatively
loosely coupled with flower.
I don't think it's loosely coupled. Emulating TC behavior in userspace
is not so easy.
Think about recent multi-mask support in flower. Previously userspace could
assume there is one mask and hash table for each preference in TC. After the
change TC accepts different masks with the same pref. Such a change tends to
break userspace emulation. It may ignore masks passed from flow insertion
and use the mask remembered when the first flow of the pref is inserted. It
may override the mask of all existing flows with the pref. It may fail to
insert such flows. Any of them would result in unexpected wrong datapath
handling which is critical.
I think such an emulation layer needs to be updated in sync with TC.
Oh, so you're saying that if xdp_flow is merged all patches to
cls_flower and netfilter which affect flow offload will be required
to update xdp_flow as well?
Hmm... you are saying that we are allowed to break other in-kernel
subsystem by some change? Sounds strange...
That's a question of policy. Technically the implementation in user
space is equivalent.
>
The advantage of user space implementation is that you can add more
to it and explore use cases which do not fit in the flow offload API,
but are trivial for BPF. Not to mention the obvious advantage of
decoupling the upgrade path.
I understand the advantage, but I can't trust such a third-party kernel
emulation solution for this kind of thing which handles critical data path.
Personally I'm not happy with the way this patch set messes with the
flow infrastructure. You should use the indirect callback
infrastructure instead, and that way you can build the whole thing
touching none of the flow offload core.
I don't want to mess up the core flow infrastructure either. I'm all
ears about less invasive ways. Using indirect callback sounds like a
good idea. Will give it a try. Many thanks.
Toshiaki Makita