Re: [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]


On Wed, Oct 04, 2023 at 10:46:23AM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
> > On Wed, Oct 04, 2023 at 10:07:02AM +0200, Florian Westphal wrote:
> > > Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
> > > > We will soon need NFT_MSG_GETRULE_RESET_NO_TIMEOUT to undo this combo
> > > > command semantics, from userspace this will require some sort of 'nft
> > > > reset table x notimeout' syntax.
> > > 
> > > NFT_MSG_GETRULE_RESET_NO_TIMEOUT sounds super ugly :/
> > > 
> > > Do you think we can add a flags attr that describes which parts
> > > to reset?
> > 
> > Sure. This will require one attribute for each object type, also
> > reject it where it does not make sense.
> > 
> > > No flags attr would reset everything.
> > 
> > Refreshing timers is a bad default behaviour.
> Looking at the "evolution" of the reset command in nftables.git
> I agree.  "reset counters" etc. was rather specific as to what
> is reset.
> > And how does this mix with the set element timeout model from
> > transaction? Now timers becomes a "moving target" again with this
> > refresh? Oh, this will drag commit_mutex to netlink dump path to avoid
> > that.
> The lock is needed to prevent *two* reset calls messing up the
> interal object state, e.g. quota or counters.

Agreed, no question there is a need for a lock from that path, the
discussion so far has been what type of lock, whether reset_spinlock
or commit_mutex.

reset_spinlock just makes sure that:

Cpu 1: nft list ruleset
Cpu 2: nft reset ruleset

do not show negative counters, as Phil reported with a test script.

commit_mutex just mitigates this issue, for reason see my reply below.

> We will need *some* type of lock for the commands where
> the reset logic is handled via dump path.
> At this point I think we should consider reverting ALL
> reset changes that use the dump path, because we also have
> the consistency issue:
> Cpu 1: nft list ruleset
> Cpu 2: nft reset ruleset
> Cpu 3: transaction, seq is bumped
> AFAIU Cpu2 will restart dump due to interrupt, so the listing
> will be consistent but will, on retry -- show counters zeroed
> in previous, inconsitent and suppressed dump.
> So I think the reset logic should be reworked to walk the rules
> and use the transaction infra to reset everything manually.
> I think we can optimize by letting userspace skip rules that
> lack a stateful object that cannot be reset.
> I don't think the dump paths were ever designed to munge existing
> data.  For those parts that provide full/consistent serialization,
> e.g. single rule is fetched, its fine.
> But whenever we may yield in between successive partial dumps, its not.

Yes, netlink dumps do not provide atomic listing semantics, that is
why there is the generation ID from userspace. I am trying to think of
a way to achieve this from the kernel but I did not come with any so

>From userspace, it should be possible to keep a generation ID per
object in the cache, so the cache becomes a collection of objects of
different generations, then when listing, just take the objects that
are still current. Or it should be possible to disambiguate this from
userspace with the u64 handle, ie. keep stale objects around and merge
them to new ones when fetching the counters.

But this is lots of work from userspace, and it might get more
complicated if more transactions happen while retrying (see test
script from Phil where transaction happens in an infinite loop).

This is the other open issue we have been discussing lately :)

> > For counters, this is to collect stats while leaving remaining things
> > as is. Refreshing timers make no sense to me.
> Looking at the history, I agree... for something like "reset counters"
> its clear what should happen.
> > For quota, this is to fetch the consumed quota and restart it, it
> > might make sense to refresh the timer, but transaction sounds like a
> > better path for this usecase?
> See above, I agree.

[Index of Archives]     [Netfitler Users]     [Berkeley Packet Filter]     [LARTC]     [Bugtraq]     [Yosemite Forum]

  Powered by Linux