Re: [PATCH nft 2/2,v2] cache: recycle existing cache with incremental updates

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jul 23, 2024 at 03:30:07PM -0400, Eric Garver wrote:
> On Tue, Jul 23, 2024 at 04:34:17PM +0200, Phil Sutter wrote:
> > On Tue, Jul 23, 2024 at 02:19:19PM +0200, Pablo Neira Ayuso wrote:
> > > On Tue, Jul 23, 2024 at 01:56:46PM +0200, Phil Sutter wrote:
> > > > Some digging and lots of printf's later:
> > > > 
> > > > On Mon, Jul 22, 2024 at 11:34:01PM +0200, Pablo Neira Ayuso wrote:
> > > > [...]
> > > > > I can reproduce it:
> > > > > 
> > > > > # nft -i
> > > > > nft> add table inet foo
> > > > > nft> add chain inet foo bar { type filter hook input priority filter; }
> > > > > nft> add rule inet foo bar accept
> > > > 
> > > > This bumps cache->flags from 0 to 0x1f (no cache -> NFT_CACHE_OBJECT).
> > > > 
> > > > > nft> insert rule inet foo bar index 0 accept
> > > > 
> > > > This adds NFT_CACHE_RULE_BIT and NFT_CACHE_UPDATE, cache is updated (to
> > > > fetch rules).
> > > > 
> > > > > nft> add rule inet foo bar index 0 accept
> > > > 
> > > > No new flags for this one, so the code hits the 'genid == cache->genid +
> > > > 1' case in nft_cache_is_updated() which bumps the local genid and skips
> > > > a cache update. The new rule then references the cached copy of the
> > > > previously commited one which still does not have a handle. Therefore
> > > > link_rules() does it's thing for references to  uncommitted rules which
> > > > later fails.
> > > > 
> > > > Pablo: Could you please explain the logic around this cache->genid
> > > > increment? Commit e791dbe109b6d ("cache: recycle existing cache with
> > > > incremental updates") is not clear to me in this regard. How can the
> > > > local process know it doesn't need whatever has changed in the kernel?
> > > 
> > > The idea is to use the ruleset generation ID as a hint to infer if the
> > > existing cache can be recycled, to speed up incremental updates. This
> > > is not sufficient for the index cache, see below.
> > 
> > So the assumption is that any changes to the ruleset in kernel were also
> > done to the cache, so we may just bump cache->genid instead of
> > refetching. In interactive nft, this requires for the last command to
> > have manually updated the cache as needed though. This is what confused
> > me. E.g., adding some chains:
> > 
> > - add chain t c2
> >   - no cache yet, fetch initially, store current genid
> >   - commit the new chain (kernel bumps genid)
> > 
> > - add cahin t c3
> >   - local genid == kernel's genid + 1, so skip cache update and bump
> >     local genid instead
> >   - commit the new chain (kernel bumps genid)
> > 
> > So I think if NFT_CACHE_RULE_BIT is set, NFT_CACHE_UPDATE must be set as
> > well to be sure that a previous command updated the rule cache if if was
> > a rule-related one. Therefore the local genid bump trick should be
> > allowed only if !NFT_CACHE_RULE_BIT || NFT_CACHE_UPDATE.
> > 
> > This still does not fix for the case at hand, because it doesn't cover
> > for the fact that cache had been updated but the contained items were
> > modified in-kernel.
> > 
> > I think 'delete rule' command needs a fix, too: cmd_evaluate_delete()
> > does not drop the rule from cache. I can't come up with something that
> > breaks due to that, though.
> > 
> > > > > Error: Could not process rule: No such file or directory
> > > > 
> > > > BTW: There are surprisingly many spots which emit a "Could not process
> > > > rule:" error. I'm sure it may be provoked for non-rule-related commands
> > > > (e.g. the calls in nft_netlink()).
> > > > 
> > > > > add rule inet foo bar index 0 accept
> > > > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > > 
> > > > > Cache woes. Maybe a bug in
> > > > > 
> > > > > commit e5382c0d08e3c6d8246afa95b7380f0d6b8c1826
> > > > > Author: Phil Sutter <phil@xxxxxx>
> > > > > Date:   Fri Jun 7 19:21:21 2019 +0200
> > > > > 
> > > > >     src: Support intra-transaction rule references
> > > > > 
> > > > > that uncover now that cache is not flushed and sync with kernel so often?
> > > > 
> > > > The commit by itself is fine, as long as the cache is up to date. The
> > > > problem is we have this previously inserted rule in cache which does not
> > > > have a handle although it was committed to kernel already. This is
> > > > something I don't see covered by e791dbe109b6d at all.
> > > 
> > > Yes, your commit relies on the frequent flush and resync at that time,
> > > which is correct.
> > > 
> > > See attached patch, it sets on NFT_CACHE_REFRESH so a fresh cache from
> > > the kernel is fetched in each run, it fixes the issue for me.
> > 
> > Yes, NFT_CACHE_REFRESH disables the conditional cache update skipping.
> > It is overly agressive though: If kernel's genid matches the cache, no
> > update should be needed.
> > 
> > > A quick recap:
> > > 
> > > Step #1 add rule inet foo bar accept
> > > 
> > > No cache is updated, because no NFT_CACHE_UPDATE is ever set on in
> > > this case. It is not possible to know that the that the next command
> > > will use the index feature.
> > > 
> > > After this point, the cache is inconsistent for this index feature.
> > 
> > This is expected: A simple 'add rule' command does not need a rule
> > cache, so even setting NFT_CACHE_UPDATE would not lead to a consistent
> > cache. This is fine, we should avoid fetching rules unless really
> > needed.
> > 
> > > Generation ID is checked, if cache is current, the previous rule in
> > > step #1 is not there in the cache.
> > 
> > Actually no rules at all, as NFT_CACHE_RULE_BIT remains unset.
> > 
> > > It should be possible to improve this logic to indicate that the cache
> > > supports updates via index, instead of forcing a cache dump from the
> > > kernel on each new index command.
> > 
> > Only if we start fetching rules for non index/handle 'rule add'/'rule
> > insert' commands, right? Also, new rules have to be replaced by their
> > kernel versions in cache to get the handles.
> > 
> > > So NFT_CACHE_UPDATE is only useful for intra-batch updates as you
> > > commit describes.
> > 
> > > diff --git a/src/cache.c b/src/cache.c
> > > index 4b797ec79ae5..1fda3f6939f5 100644
> > > --- a/src/cache.c
> > > +++ b/src/cache.c
> > > @@ -68,7 +68,7 @@ static unsigned int evaluate_cache_add(struct cmd *cmd, unsigned int flags)
> > >  
> > >  		if (cmd->handle.index.id ||
> > >  		    cmd->handle.position.id)
> > > -			flags |= NFT_CACHE_RULE | NFT_CACHE_UPDATE;
> > > +			flags |= NFT_CACHE_RULE | NFT_CACHE_UPDATE | NFT_CACHE_REFRESH;
> > >  		break;
> > >  	default:
> > >  		break;
> > 
> > Acked-by: Phil Sutter <phil@xxxxxx>
> 
> This patch fixes the failures around the index keyword. I see one more
> issue around set entries.
> 
> Notably, if the set add and element add are on separate lines (and thus
> round trips to the kernel) then the issue is not seen. Perhaps there are
> more instances with other stateful objects.
> 
> --->8---
> 
> # cat /tmp/foo
> add table inet foo
> add set inet foo bar { type ipv4_addr; flags interval; }; add element inet foo bar { 10.1.1.1/32 }
> add element inet foo bar { 10.1.1.2/32 }
> 
> # nft -i < /tmp/foo
> internal:0:0-0: Error: Could not process rule: File exists

Netlink debug output indicates what's going on:

| nft> add set inet foo bar { type ipv4_addr; flags interval; }; add element inet foo bar { 10.1.1.1/32 }
| bar foo 4
| bar foo 0
| 	element 0101010a  : 0 [end]	element 0201010a  : 1 [end]
| nft> add element inet foo bar { 10.1.1.2/32 }
| bar foo 0
| 	element 00000000  : 1 [end]	element 01000000  : 1 [end]
| 	element 0201010a  : 1 [end]	element 0301010a  : 1 [end]
| 	element 0201010a  : 0 [end]	element 0301010a  : 1 [end]

In contrast, the same with separate 'add set' and 'add element'
commands:

| nft> add set inet foo bar { type ipv4_addr; flags interval; };
| bar foo 4
| nft> add element inet foo bar { 10.1.1.1/32 }
| bar foo 0
| 	element 00000000  : 1 [end]	element 0101010a  : 0 [end]
| 	element 0201010a  : 1 [end]
| nft> add element inet foo bar { 10.1.1.2/32 }
| bar foo 0
| 	element 00000000  : 1 [end]	element 0201010a  : 0 [end]
| 	element 0301010a  : 1 [end]

So I assume the code does not recognize the previously committed
elements as such and tries to commit them again.

Reverting commit e791dbe109b6d helps here as well.

BTW: It's funny how your second reproducer proves my suspicion that
nft_cmd_error() may print "Could not process rule:" errors for non-rule
commands.

Cheers, Phil




[Index of Archives]     [Netfitler Users]     [Berkeley Packet Filter]     [LARTC]     [Bugtraq]     [Yosemite Forum]

  Powered by Linux