Re: [PATCH nf-next 00/19] netfilter: nftables: dscp modification offload

Boris Sukholitko <boris.sukholitko@xxxxxxxxxxxx> · Tue, 9 May 2023 17:56:58 +0300

On Mon, May 08, 2023 at 10:07:40PM +0200, Pablo Neira Ayuso wrote:
> On Mon, May 08, 2023 at 04:38:06PM +0300, Boris Sukholitko wrote:
> > On Sun, May 07, 2023 at 07:37:58PM +0200, Florian Westphal wrote:
> > > Boris Sukholitko <boris.sukholitko@xxxxxxxxxxxx> wrote:
> > > > On Wed, May 3, 2023 at 9:46 PM Florian Westphal <fw@xxxxxxxxx> wrote:
> > > > >
> > > > > Boris Sukholitko <boris.sukholitko@xxxxxxxxxxxx> wrote:
> > > > [... snip to non working offload ...]
> > > > 
> > > > > > table inet filter {
> > > > > >         flowtable f1 {
> > > > > >                 hook ingress priority filter
> > > > > >                 devices = { veth0, veth1 }
> > > > > >         }
> > > > > >
> > > > > >         chain forward {
> > > > > >                 type filter hook forward priority filter; policy accept;
> > > > > >                 ip dscp set cs3 offload
> > > > > >                 ip protocol { tcp, udp, gre } flow add @f1
> > > > > >                 ct state established,related accept
> > > > > >         }
> > > > > > }
> > > > 
> > > > [...]
> > > > 
> > > > >
> > > > > I wish you would have reported this before you started to work on
> > > > > this, because this is not a bug, this is expected behaviour.
> > > > >
> > > > > Once you offload, the ruleset is bypassed, this is by design.
> > > > 
> > > > From the rules UI perspective it seems possible to accelerate
> > > > forward chain handling with the statements such as dscp modification there.
> > > > 
> > > > Isn't it better to modify the packets according to the bypassed
> > > > ruleset thus making the behaviour more consistent?
> > > 
> > > The behaviour is consistent.  Once flow is offloaded, ruleset is
> > > bypassed.  Its easy to not offload those flows that need the ruleset.
> > > 
> > > > > Lets not make the software offload more complex as it already is.
> > > > 
> > > > Could you please tell which parts of software offload are too complex?
> > > > It's not too bad from what I've seen :)
> > > > 
> > > > This patch series adds 56 lines of code in the new nf_conntrack.ext.c
> > > > file. 20 of them (nf_flow_offload_apply_payload) are used in
> > > > the software fast path. Is it too high of a price?
> > > 
> > > 56 lines of code *now*.
> > > 
> > > Next someone wants to call into sets/maps for named counters that
> > > they need.  Then someone wants limit or quota to work.  Then they want fib
> > > for RPF.  Then xfrm policy matching to augment acccounting.
> > > This will go on until we get to the point where removing "fast" path
> > > turns into a performance optimization.
> > 
> > OK. May I assume that you are concerned with the eventual performance impact
> > on the software fast path (i.e. nf_flow_offload_ip_hook)?
> 
> I think Florian's concern is that there is better infrastructure to
> handle for ruleset offloads, ie. ingress/egress ruleset hardware
> offload infrastructure.
> 

I fully agree that ingress and egress chains are better for hardware
offload.

But what about software only platform? Having additional ingress or
egress chains means going through nftables VM in addition to the forward
chain. With the usual "premature optimization" disclaimers, IMHO we can
get better performance by doing only forward table software fast path.

> > Obviously the performance of the fast path is very important to our
> > customers. Otherwise they would not be requiring dscp fast path
> > modification. :)
> > 
> > One of the things we've thought about regarding the fast path
> > performance is rewriting nf_flow_offload_ip_hook to work with
> > nf_flowtable->flow_block instead of flow_offload_tuple.
> > 
> > We hope that iterating over flow_action_entry list similar to what the
> > hardware acceleration does, will be more efficient also in software.
> > 
> > Nice side-effect of such optimization would be that the amount of
> > feature bloat (such as dscp modification!) will not affect your typical
> > connection unless the user actually uses them.
> > 
> > For example, for dscp payload modification we'll generate
> > FLOW_ACTION_MANGLE entry. This entry will appear on flow_block's of
> > the only connections which require it. Others will be uneffected.
> >
> > Would you be ok with such direction (with performance tests of
> > course)?
> 
> I am still missing the reason why the ingress/egress ruleset hardware
> offload infrastructure is not a good fit for your requirements.

Because our current target is maximizing performance of the software fast
path only. We need to have our customer nftables rulesets running as
fast as possible on their hardware. For this, software fast path seems
like ideal solution. Maybe with a bit of work ... :)

Thanks,
Boris.
Attachment:
smime.p7s

Description: S/MIME Cryptographic Signature