On Mon, Dec 17, 2018 at 10:11:30PM +0100, Phil Sutter wrote: > On Mon, Dec 17, 2018 at 08:25:24PM +0100, Pablo Neira Ayuso wrote: > > On Mon, Dec 17, 2018 at 07:24:24PM +0100, Phil Sutter wrote: > > > Hi Pablo, > > > > > > On Mon, Dec 17, 2018 at 06:42:49PM +0100, Pablo Neira Ayuso wrote: > > > > On Thu, Dec 13, 2018 at 12:15:56PM +0100, Phil Sutter wrote: > > > > > Use recently introduced support for rules inside chains in libnftnl to > > > > > introduce a rule cache per chain instead of a global one. > > > > > > > > > > A tricky bit is to decide if cache should be updated or not. Previously, > > > > > the global rule cache was populated just once and then reused unless > > > > > being flushed completely (via call to flush_rule_cache() with > > > > > NULL-pointer table argument). Resemble this behaviour by introducing a > > > > > boolean indicating cache status and fetch rules for all chains when > > > > > updating the chain cache in nft_chain_list_get(). > > > > > > > > This is a rather large rewrite: Nice because it's going to speed up > > > > things. > > > > > > > > I think we should start doing ruleset generation tracking, IIRC this > > > > is missing in the existing codebase. Such approach requires to keep > > > > track of the ruleset generation number and invalidate caches. > > > > > > Do you think parallel ruleset updates are a problem? None of the xtables > > > tools are meant to stay running for long, so I didn't expect problems > > > there until now. Sure, loading 10k rules with iptables-nft-restore takes > > > a long time, but if kernel's ruleset changes in between, I don't see how > > > we could recover from that (apart from restarting the whole restore). > > > > Yes, restarting is the only way. > > > > I was not thinking on the "load big ruleset" via -restore problem > > (it's flushing the ruleset unless noflush option is used), instead > > incremental updates could be? eg. robot placing rules to be > > added/deleted/replaced in a batch file that is passed to > > iptables-restore to update an existing ruleset. > > Ah, I see. Do you think it's feasible to support that in iptables-nft > given that new users are probably better off if they use nftables > directly? Right indeed, nftables is better there, but you can also use iptables-restore to do incremental ruleset updates with noflush option in iptables. > > But in such case, the rules in a given chain are still consistent, > > since we check for interference there already via EINTR in netlink > > dumps. We just may end up caching chains with rules in different > > generations with this new approach. Probably this may end up to > > misleading error reporting in a heavy concurrent rules update > > environment. We could still easily detect this by enforcing checks for > > consistent generation idea (otherwise restart) and use > > NFNL_BATCH_GENID attribute for "batch is based on ruleset generation > > X semantics" (not used by userspace code yet). That could be done in a > > follow up patchset if needed. We can probably get rid of the global > > userspace lock for iptables at some point. > > Yes, indeed. My changes allow for fetching rules of a later generation > than the chains. At this point though, we either have a command which > doesn't need the rule cache or chain and rule fetching happens shortly > after each other. And once these have been fetched, no further cache > updates happen. We also have to be careful here, because at least my > patch fixing for wrong rule position when using '-I <NUM>' with > iptables-nft-restore heavily depends upon a chain which has been > removed/flushed earlier is not fetched again. Hence why I introduced > those booleans, basically avoiding cache updates despite a possible > asymmetry. Sounds fine. > That global userspace lock you're talking about is within kernel space? > I'm asking because the locking in userspace is not required with > iptables-nft (as opposed to legacy). Exactly. The userspace lock is backward compatibility stuff for scripts. OK, please revamp and send v3, thanks for addressing my feedback.