On Mon, May 20, 2019 at 02:06:20PM +0200, Florian Westphal wrote: > Phil Sutter <phil@xxxxxx> wrote: > > On Sun, May 19, 2019 at 06:45:08PM +0200, Phil Sutter wrote: > > [...] > > > The only way to make the above work is by keeping the original cache > > > copy around until mnl_batch_talk has finally succeeded or failed with > > > something else than ERESTART. > > > > How about a completely different approach: > > > > If memory serves right (and from reading the related Red Hat ticket), > > the actual problem we're trying to solve is that iptables-nft-restore > > creates NFT_MSG_DELTABLE only if that table exists already at the point > > of parsing but another client might create it in the mean time before > > committing. > > > > My idea for solving this was to unconditionally create NFT_MSG_NEWTABLE > > followed by NFT_MSG_DELTABLE - in case the table exists, the first one > > is a noop; in case the table doesn't exist, the second one won't provoke > > an error message from kernel space. > > Does that work even work? > new table x > del table x > add rule to x // table was deleted? > > Or are you talking about a new/del/new sequence? Oh, yes of course. Existing iptables-nft-restore does: 1) delete table x if exists 2) add table x 3) add table x content My idea is to: + 0) add table x - 1) delete table x if exists + 1) delete table x 2) add table x 3) add table x content > If it works, ok/fine, but it seems ugly. Did you consider rule insert with index in your batch replay logic? I did when duplicating it for nft, but decided it's not worth it and people using 'add rule ... index IDX' have been warned already anyway. > > Since NFT_MSG_DELTABLE removes the table recursively, we don't need to > > care about any content added by the other client. > > Yes, we can't do this in the --noflush case though. My state from our last talk about it was "--noflush users are screwed anyway". :) > > Or is this about a generic solution for all commands not just > > iptables-nft-restore (without --noflush)? > > Its for ipt-nft-restore, including --noflush. > > It would be good though to also speed up 'iptables-nft -A' later on > by eliding cache completely. The problem is that we can't unconditionally create NFT_MSG_NEWCHAIN because that will reset the chain policy if non-default. So we need at least a cache containing tables and chains, even for that simple rule append command. Cheers, Phil