Hi Phil, On Thu, Oct 31, 2019 at 06:19:47PM +0100, Phil Sutter wrote: > On Thu, Oct 31, 2019 at 04:02:34PM +0100, Pablo Neira Ayuso wrote: > > On Thu, Oct 24, 2019 at 06:37:05PM +0200, Phil Sutter wrote: > > > This series speeds up xtables-restore calls with --noflush (typically > > > used to batch a few commands for faster execution) by preliminary input > > > inspection. > > > > > > Before, setting --noflush flag would inevitably lead to full cache > > > population. With this series in place, if input can be fully buffered > > > and no commands requiring full cache is contained, no initial cache > > > population happens and each rule parsed will cause fetching of cache > > > bits as required. > > > > > > The input buffer size is arbitrarily chosen to be 64KB. > > > > > > Patches one and two prepare code for patch three which moves the loop > > > content parsing each line of input into a separate function. The > > > reduction of code indenting is used by patch four which deals with > > > needless line breaks. > > > > For patches from 1 to 4 in this batch: > > > > Acked-by: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> > > > > > Patch five deals with another requirement of input buffering, namely > > > stripping newline characters from each line. This is not a problem by > > > itself, but add_param_to_argv() replaces them by nul-chars and so > > > strings stop being consistently terminated (some by a single, some by > > > two nul-chars). > > > > > > Patch six then finally adds the buffering and caching decision code. > > > > > > Patch seven is pretty unrelated but tests a specific behaviour of > > > *tables-restore I wasn't sure of at first. > > > > Do you have any number? > > Yes, I wrote a small benchmark based on some Kubernetes use-case. It > measures loading of dumps like: > > | *nat > | :KUBE-SVC-23 - [0:0] > | :KUBE-SEP-23 - [0:0] > | -A KUBE-HOOK ! -s 10.128.0.0/14 -d 172.30.108.136/32 -p tcp -m comment --comment \"openshift-controller-manager/controller-manager:https cluster IP\" -m tcp --dport 443 -j KUBE-MARK-MASQ > | -A KUBE-HOOK -d 172.30.108.136/32 -p tcp -m comment --comment \"openshift-controller-manager/controller-manager:https cluster IP\" -m tcp --dport 443 -j KUBE-SVC-23 > | -A KUBE-SVC-23 -j KUBE-SEP-23 > | -A KUBE-SEP-23 -s 10.128.0.38/32 -j KUBE-MARK-MASQ > | -A KUBE-SEP-23 -p tcp -m tcp -j DNAT --to-destination 10.128.0.38:8443 > | COMMIT > > Into a ruleset with increasing size (created by repeating the snippet above): > > size (*100) | legacy | nft-pre | nft-post > --------------------------------------------------------- > 1 .0040366426 .0079313714 .0025598650 > 10 .0146918664 .0459193868 .0025134858 > 25 .0361553334 .1195503778 .0024202904 > 50 .0699177362 .2547542626 .0024351612 > 75 .1062593206 .4078182120 .0024362044 > 100 .1614045514 .5636617378 .0024195190 Thanks, this is nice. I can see this function: static bool cmd_needs_full_cache(char *cmd) the pre-parsing of the input to calculate the cache, which is good. One thing: why do you need the conversion from \n to \0. The idea is to read once from the file and keep it in a buffer, then pass it to the original parsing function after this pre-parsing to calculate the cache. Please, add this to the remaining patches of this series. Acked-by: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> Thanks.