On Fri, Jul 31, 2020 at 01:21:34PM +0200, Phil Sutter wrote:
> Hi Pablo,
>
> On Thu, Jul 30, 2020 at 09:25:54PM +0200, Pablo Neira Ayuso wrote:
> > On Thu, Jul 30, 2020 at 03:57:10PM +0200, Phil Sutter wrote:
> > > The full list of tables in kernel is not relevant, only those used by
> > > iptables-nft and for those, knowing if they exist or not is sufficient.
> > > For holding that information, the already existing 'table' array in
> > > nft_cache suits well.
> > >
> > > Consequently, nft_table_find() merely checks if the new 'exists' boolean
> > > is true or not and nft_for_each_table() iterates over the builtin_table
> > > array in nft_handle, additionally checking the boolean in cache for
> > > whether to skip the entry or not.
> > >
> > > Signed-off-by: Phil Sutter <phil@xxxxxx>
> > > ---
> > >  iptables/nft-cache.c | 73 +++++++++++---------------------------------
> > >  iptables/nft-cache.h |  9 ------
> > >  iptables/nft.c       | 55 +++++++++------------------------
> > >  iptables/nft.h       |  2 +-
> > >  4 files changed, 34 insertions(+), 105 deletions(-)
> >
> > This diffstat looks interesting :-)
>
> As promised, I wanted to leverage your change for further optimization,
> but ended up optimizing your code out along with the old one. :D
>
> > One question:
> >
> >         c->table[i].exists = true;
> >
> > then we assume this table is still in the kernel and we don't recheck?
>
> Upon each COMMIT line, nft_action() calls nft_release_cache(). This will
> also reset the 'exists' value to false.

Thanks for explaining.

I think the chain cache can also be converted to use a Linux list, right?

> > I mean, if you pipe commands to an open process running
> > iptables-restore (which has been the recommended interface for years
> > to avoid the overhead of system() invocation and to ensure atomic
> > updates), is there any cache this new approach might get out of sync?
>
> This is not just a problem of iptables-restore running in a pipe -
> restoring a large ruleset (or just pure coincidence) could lead to the
> same result.
>
> Playing with 'iptables-nft-restore --noflush' reading from stdin and
> calling 'nft flush ruleset' in a second shell right before entering
> 'COMMIT' leads to funny errors. This is not related to the table list
> elimination though. I'll investigate.

There is a generation number that userspace sends to the kernel along
with the batch, so the kernel can tell whether userspace is working
with a stale cache; if it is, userspace has to retry. This should help
catch the interference scenario and basically (transparently) restart
from scratch.
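
To make the generation check concrete, here is a toy model in C. It is
only a sketch under stated assumptions: fake_kernel, fake_cache,
kernel_commit(), concurrent_flush() and append_rule() are hypothetical
names made up for this illustration, and none of this is the actual
nf_tables or iptables-nft code. It only shows the idea that a commit
carries the generation the cache was built from, gets rejected if the
ruleset has changed since, and is then redone against a fresh cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the ruleset state held by the kernel. */
struct fake_kernel {
	uint32_t genid;		/* bumped on every successful change */
	int nrules;
};

/* Hypothetical stand-in for the cache kept in userspace. */
struct fake_cache {
	bool valid;
	uint32_t genid;		/* generation the cache was built from */
	int nrules;
};

static void cache_fetch(struct fake_cache *c, const struct fake_kernel *k)
{
	c->genid = k->genid;
	c->nrules = k->nrules;
	c->valid = true;
}

/* Accept the change only if it was prepared against the current generation. */
static bool kernel_commit(struct fake_kernel *k, uint32_t batch_genid,
			  int new_nrules)
{
	if (batch_genid != k->genid)
		return false;	/* stale cache, caller has to start over */
	k->nrules = new_nrules;
	k->genid++;
	return true;
}

/* Simulates a second process, e.g. 'nft flush ruleset' in another shell. */
static void concurrent_flush(struct fake_kernel *k)
{
	k->nrules = 0;
	k->genid++;
}

/*
 * Append one rule, restarting from a fresh cache whenever the commit is
 * rejected. If interfere_once is set, a concurrent flush happens between
 * the first cache fetch and the first commit. Returns the attempt count.
 */
static int append_rule(struct fake_kernel *k, struct fake_cache *c,
		       bool interfere_once)
{
	int attempts = 0;

	assert(k && c);
	for (;;) {
		attempts++;
		if (!c->valid)
			cache_fetch(c, k);
		if (interfere_once) {
			concurrent_flush(k);
			interfere_once = false;
		}
		if (kernel_commit(k, c->genid, c->nrules + 1)) {
			c->valid = false;	/* drop cache after commit */
			return attempts;
		}
		c->valid = false;	/* stale: rebuild and retry */
	}
}
```

In this model, calling append_rule() with interfere_once set on a
ruleset holding three rules needs two attempts: the first commit is
refused because the flush bumped the generation, and the retry adds the
rule on top of the now empty ruleset.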