Re: [iptables PATCH v3 0/7] Improve xtables-restore performance

Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> · Wed, 6 Nov 2019 10:24:52 +0100

Hi Phil,

On Thu, Oct 31, 2019 at 06:19:47PM +0100, Phil Sutter wrote:
> On Thu, Oct 31, 2019 at 04:02:34PM +0100, Pablo Neira Ayuso wrote:
> > On Thu, Oct 24, 2019 at 06:37:05PM +0200, Phil Sutter wrote:
> > > This series speeds up xtables-restore calls with --noflush (typically
> > > used to batch a few commands for faster execution) by preliminary input
> > > inspection.
> > > 
> > > Before, setting the --noflush flag would inevitably lead to full cache
> > > population. With this series in place, if the input can be fully buffered
> > > and contains no commands requiring a full cache, no initial cache
> > > population happens and each parsed rule causes cache bits to be fetched
> > > as required.
> > > 
> > > The input buffer size is arbitrarily chosen to be 64KB.
> > > 
> > > Patches one and two prepare code for patch three which moves the loop
> > > content parsing each line of input into a separate function. The
> > > reduction of code indenting is used by patch four which deals with
> > > needless line breaks.
> > 
> > For patches from 1 to 4 in this batch:
> > 
> > Acked-by: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
> > 
> > > Patch five deals with another requirement of input buffering, namely
> > > stripping newline characters from each line. This is not a problem by
> > > itself, but add_param_to_argv() replaces them by nul-chars and so
> > > strings stop being consistently terminated (some by a single, some by
> > > two nul-chars).
> > > 
> > > Patch six then finally adds the buffering and caching decision code.
> > > 
> > > Patch seven is pretty unrelated but tests a specific behaviour of
> > > *tables-restore I wasn't sure of at first.
> > 
> > Do you have any numbers?
> 
> Yes, I wrote a small benchmark based on some Kubernetes use-case. It
> measures loading of dumps like:
> 
> | *nat
> | :KUBE-SVC-23 - [0:0]
> | :KUBE-SEP-23 - [0:0]
> | -A KUBE-HOOK ! -s 10.128.0.0/14 -d 172.30.108.136/32 -p tcp -m comment --comment \"openshift-controller-manager/controller-manager:https cluster IP\" -m tcp --dport 443 -j KUBE-MARK-MASQ
> | -A KUBE-HOOK -d 172.30.108.136/32 -p tcp -m comment --comment \"openshift-controller-manager/controller-manager:https cluster IP\" -m tcp --dport 443 -j KUBE-SVC-23
> | -A KUBE-SVC-23 -j KUBE-SEP-23
> | -A KUBE-SEP-23 -s 10.128.0.38/32 -j KUBE-MARK-MASQ
> | -A KUBE-SEP-23 -p tcp -m tcp -j DNAT --to-destination 10.128.0.38:8443
> | COMMIT
> 
> Into a ruleset with increasing size (created by repeating the snippet above):
> 
> size (*100)   legacy          nft-pre         nft-post
> ---------------------------------------------------------
> 1             .0040366426     .0079313714     .0025598650
> 10            .0146918664     .0459193868     .0025134858
> 25            .0361553334     .1195503778     .0024202904
> 50            .0699177362     .2547542626     .0024351612
> 75            .1062593206     .4078182120     .0024362044
> 100           .1614045514     .5636617378     .0024195190

Thanks, this is nice.

I can see this function:

        static bool cmd_needs_full_cache(char *cmd)

It is used to pre-parse the input and calculate the cache, which is good.

One thing: why do you need the conversion from \n to \0? The idea is
to read the file once and keep it in a buffer, then pass that buffer to
the original parsing function once the pre-parsing has calculated the
cache.
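
A rough sketch of that idea, not taken from the patches:
line_needs_full_cache() and input_needs_full_cache() are made-up names,
and the rule deciding which lines need the full cache is only an
assumption for illustration. Lines are passed as pointer plus length, so
the buffer is scanned without being modified:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Buffer size named in the cover letter (64KB). */
#define INPUT_BUF_SIZE (64 * 1024)

/*
 * Hypothetical stand-in for cmd_needs_full_cache(). The line is given as
 * pointer + length, so no nul-termination is required. ASSUMPTION: empty
 * lines, comments, table lines, chain lines, COMMIT and "-A" are treated
 * as safe; everything else is treated as needing the full cache.
 */
static bool line_needs_full_cache(const char *line, size_t len)
{
	if (len == 0 || line[0] == '#' || line[0] == '*' || line[0] == ':')
		return false;
	if (len == 6 && !memcmp(line, "COMMIT", 6))
		return false;
	if (len >= 3 && !memcmp(line, "-A ", 3))
		return false;
	return true;
}

/*
 * Pre-scan buffered input line by line. The buffer is left untouched, so
 * it can be handed to the regular parser afterwards.
 */
static bool input_needs_full_cache(const char *buf, size_t len)
{
	const char *p = buf;
	const char *end = buf + len;

	while (p < end) {
		const char *nl = memchr(p, '\n', (size_t)(end - p));
		size_t linelen = nl ? (size_t)(nl - p) : (size_t)(end - p);

		if (line_needs_full_cache(p, linelen))
			return true;
		if (!nl)
			break;		/* last line had no trailing newline */
		p = nl + 1;
	}
	return false;
}
```

The caller would read up to INPUT_BUF_SIZE bytes once, run
input_needs_full_cache() on them, and then feed the very same buffer to
the existing per-line parsing.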

Please add this to the remaining patches of this series:

Acked-by: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>

Thanks.