Re: [PATCH nft 2/2] debug: include kernel set information on cache fill

Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
> On Fri, Nov 22, 2024 at 02:43:27PM +0100, Florian Westphal wrote:
> > Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
> > > > Sure, wasn't that the reason why you initially wanted to restrict this to
> > > > --netlink=debug?  What made you change your mind?
> > > 
> > > With large garbage collection cycle, this counter provides a hint to
> > > the user to understand that slots are still being consumed by expired
> > > elements.
> > 
> > But how / where is that relevant?
> > 
> > rbtree does gc at insert time.  We could extend rbtree to force gc
> > even if interval is huge in case we have many expired elements.
> > 
> > We could do this by making __nft_rbtree_insert() count the number
> > of expired nodes that it saw during traversal, then force gc at commit
> > time even if time_after_eq() isn't met.
> 
> IIRC, rbtree insert path already performs gc on-demand.

It doesn't do a full scan though.
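For reference, the idea quoted above (count expired nodes seen during insert traversal, then force gc at commit time) could look roughly like the following. This is a standalone userspace sketch on a plain unbalanced binary tree, not the kernel rbtree code; struct toy_tree, toy_insert(), toy_commit_wants_gc() and the TOY_EXPIRED_FORCE_GC threshold are all made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical threshold: force gc once this many expired nodes were seen. */
#define TOY_EXPIRED_FORCE_GC 2u

/* Toy tree node; stands in for an rbtree element with a timeout. */
struct toy_node {
	uint32_t key;
	uint64_t expires_ms;	/* 0 means no timeout */
	struct toy_node *left, *right;
};

struct toy_tree {
	struct toy_node *root;
	uint32_t expired_seen;	/* expired nodes met since the last commit */
	uint64_t last_gc_ms;	/* time of the last gc run */
	uint64_t gc_int_ms;	/* regular gc interval */
};

static bool toy_expired(const struct toy_node *n, uint64_t now_ms)
{
	return n->expires_ms && n->expires_ms <= now_ms;
}

/* Insert a node and count the expired nodes visited on the way down. */
static void toy_insert(struct toy_tree *t, struct toy_node *node,
		       uint64_t now_ms)
{
	struct toy_node **p = &t->root;

	while (*p) {
		if (toy_expired(*p, now_ms))
			t->expired_seen++;
		p = node->key < (*p)->key ? &(*p)->left : &(*p)->right;
	}
	node->left = node->right = NULL;
	*p = node;
}

/*
 * Commit-time decision: run gc if the interval elapsed, or if enough
 * expired nodes were seen during this batch.  Resets the counter; the
 * caller is expected to update last_gc_ms when it actually runs gc.
 */
static bool toy_commit_wants_gc(struct toy_tree *t, uint64_t now_ms)
{
	bool due = now_ms - t->last_gc_ms >= t->gc_int_ms;
	bool forced = t->expired_seen >= TOY_EXPIRED_FORCE_GC;

	t->expired_seen = 0;
	return due || forced;
}
```

The counter only reflects nodes on the insert paths, so it is a hint rather than a full count, which is the same limitation as noted above.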

Maybe let's take two steps back.  What is the actual issue that
needs to be resolved?

Even if nelems/count is dumped while concealing the
rbtree details, it's still confusing: you get
nelems 42 but no (or fewer) elements in the
elements = { ... } dump, because of the timeout
handling (expired elements are not dumped).

So in case we have to document that nelems/count isn't
the number of active elements but of stored elements, including
the inactive ones, then we might as well not export this
and instead document the consequence of a large gc interval.
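To illustrate the mismatch between stored and active elements, here is a small standalone sketch. It is userspace C, not kernel code; struct toy_elem and toy_count_active() are hypothetical names used only for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy element: only the expiration time matters for this example. */
struct toy_elem {
	uint64_t expires_ms;	/* 0 means no timeout */
};

/*
 * Count the elements that a listing would show at time now_ms.
 * 'nelems' is the stored count, which still includes expired
 * elements that gc has not reclaimed yet.
 */
static size_t toy_count_active(const struct toy_elem *elems, size_t nelems,
			       uint64_t now_ms)
{
	size_t i, active = 0;

	for (i = 0; i < nelems; i++) {
		if (elems[i].expires_ms == 0 || elems[i].expires_ms > now_ms)
			active++;
	}
	return active;
}
```

With three stored elements, one permanent and two that have timed out, the stored count stays at 3 until gc runs, while only one element is active.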

We could also do something even simpler: when we hit the
size limit on dataplane insertion for a TIMEOUT element,
expedite the next gc scan if the gc interval is > 10s (or some
other value -- don't want constant scans when the set is full
with no timed out elements).
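Rough sketch of what that could look like, as standalone userspace C rather than kernel code. All names here (struct toy_set, toy_set_try_insert(), GC_EXPEDITE_THRESHOLD_MS) are made up for illustration and are not existing nf_tables API. Pulling the next scan forward to "now + threshold" instead of scanning immediately is an assumption on my side, so that a full set without expired elements is rescanned at most once per threshold period.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical threshold: only expedite when the gc interval exceeds 10s. */
#define GC_EXPEDITE_THRESHOLD_MS 10000u

/* Toy stand-in for a set with timeouts; not the kernel's struct nft_set. */
struct toy_set {
	uint32_t size;		/* maximum number of stored elements */
	uint32_t nelems;	/* stored elements, including expired ones */
	uint32_t gc_int_ms;	/* regular gc interval */
	uint64_t next_gc_ms;	/* absolute time of the next scheduled scan */
};

/*
 * Attempt a dataplane insertion of a timeout element at time now_ms.
 * Returns true if a slot was available.  If the set is full and the gc
 * interval is long, the next scan is pulled forward so that expired
 * elements release their slots sooner.
 */
static bool toy_set_try_insert(struct toy_set *set, uint64_t now_ms)
{
	uint64_t deadline = now_ms + GC_EXPEDITE_THRESHOLD_MS;

	if (set->nelems < set->size) {
		set->nelems++;
		return true;
	}

	/* Full: expired elements may be holding slots until the next scan. */
	if (set->gc_int_ms > GC_EXPEDITE_THRESHOLD_MS &&
	    set->next_gc_ms > deadline)
		set->next_gc_ms = deadline;

	return false;
}
```

Sets with a gc interval at or below the threshold are left alone, since their regular scan comes soon enough anyway.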

> I would really like to provide an alternative interface for the rbtree
> to allow for the same netlink representation as pipapo. I expected
> pipapo could replace rbtree, but you mentioned in the past
> this could be an issue.

pipapo has other issues, just compare insert and delete times
of pipapo with those of hash or rbtree.

Even if that's not a concern, ATM userspace cannot force pipapo even if
it wanted to, so this is moot anyway.

> > I'd prefer to avoid this mess.
> 
> OK, then we assume this will be forever used for debugging only,
> unless rbtree is fully replaced.

Only if this fixup stuff is done in the kernel, which sabotages
debug output (conceals actual elements by some strategy rather
than just expose set->nelems).

> Please, let me have a look, if I fail or it is too ugly you can still
> ditch it and we can follow up with your approach.

OK.



