Re: [PATCH nft 2/2,v2] cache: recycle existing cache with incremental updates

On Tue, Jul 23, 2024 at 02:57:07PM +0200, Pablo Neira Ayuso wrote:
> On Tue, Jul 23, 2024 at 02:19:25PM +0200, Pablo Neira Ayuso wrote:
> > On Tue, Jul 23, 2024 at 01:56:46PM +0200, Phil Sutter wrote:
> > > Some digging and lots of printf's later:
> > > 
> > > On Mon, Jul 22, 2024 at 11:34:01PM +0200, Pablo Neira Ayuso wrote:
> > > [...]
> > > > I can reproduce it:
> > > > 
> > > > # nft -i
> > > > nft> add table inet foo
> > > > nft> add chain inet foo bar { type filter hook input priority filter; }
> > > > nft> add rule inet foo bar accept
> > > 
> > > This bumps cache->flags from 0 to 0x1f (no cache -> NFT_CACHE_OBJECT).
> > > 
> > > > nft> insert rule inet foo bar index 0 accept
> > > 
> > > This adds NFT_CACHE_RULE_BIT and NFT_CACHE_UPDATE, cache is updated (to
> > > fetch rules).
> > > 
> > > > nft> add rule inet foo bar index 0 accept
> > > 
> > > No new flags for this one, so the code hits the 'genid == cache->genid +
> > > 1' case in nft_cache_is_updated() which bumps the local genid and skips
> > > a cache update. The new rule then references the cached copy of the
> > > previously committed one which still does not have a handle. Therefore
> > > link_rules() does its thing for references to uncommitted rules which
> > > later fails.
> > > 
> > > Pablo: Could you please explain the logic around this cache->genid
> > > increment? Commit e791dbe109b6d ("cache: recycle existing cache with
> > > incremental updates") is not clear to me in this regard. How can the
> > > local process know it doesn't need whatever has changed in the kernel?
> > 
> > The idea is to use the ruleset generation ID as a hint to infer if the
> > existing cache can be recycled, to speed up incremental updates. This
> > is not sufficient for the index cache, see below.
> 
> I have to revisit e791dbe109b6d, another process could race to bump
> the generation ID incrementally and I incorrectly assumed cache is
> consistent.

It might be fine, because cache->genid != 0 means we have fetched from
the kernel previously and thus also committed a change (list commands set
CACHE_REFRESH). The kernel genid is then expected to be cache->genid + 1;
a concurrent commit would bump it again.

I don't like the commit because it breaks with the assumption that
kernel genid matching cache genid means cache is up to date. It may
indeed be, but I think it's thin ice and caching code is pretty complex
as-is. :/
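
To make the genid reasoning above concrete, here is a toy model of the
check as I understand it. This is only a sketch: struct toy_cache and
toy_cache_can_recycle() are made-up names, the flag handling is
simplified, and none of it is the actual code in nft_cache_is_updated().

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for nft's cache state; NOT the real struct nft_cache. */
struct toy_cache {
	uint32_t genid;	/* generation ID recorded locally, 0 = never fetched */
	uint32_t flags;	/* bitmask of what the cache currently holds */
};

/*
 * Returns true if the existing cache may be recycled, false if it has to
 * be refetched from the kernel. Hypothetical helper for illustration.
 */
static bool toy_cache_can_recycle(struct toy_cache *cache,
				  uint32_t wanted_flags,
				  uint32_t kernel_genid)
{
	/* Never fetched, or the command needs something we do not hold. */
	if (cache->genid == 0 ||
	    (cache->flags & wanted_flags) != wanted_flags)
		return false;

	/* Classic assumption: same genid, so the cache is up to date. */
	if (kernel_genid == cache->genid)
		return true;

	/*
	 * Incremental case: assume the single bump was our own commit,
	 * adopt the new genid and skip the refetch.
	 */
	if (kernel_genid == cache->genid + 1) {
		cache->genid++;
		return true;
	}

	/* More than one bump: someone else committed as well. */
	return false;
}
```

With a local genid of 5, a kernel genid of 5 or 6 recycles the cache
(the latter bumps the local genid to 6), while 8 forces a refetch. The
+1 branch is the one I consider thin ice: afterwards both genids match,
yet the cached objects still lack whatever the commit changed in the
kernel, e.g. the handle of the rule that was just added.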

Cheers, Phil



