Re: update element timeout support [was Re: [PATCH nf 1/2] netfilter: nft_set_rbtree: move sync GC from insert path to set->ops->commit]

Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> · Mon, 2 Oct 2023 23:10:38 +0200

On Mon, Oct 02, 2023 at 03:58:38PM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
> > On Sun, Oct 01, 2023 at 11:08:16PM +0200, Florian Westphal wrote:
> > > Element E1, times out in 1 hour
> > > Element E2, times out in 1 second
> > > Element E3, timed out (1 second ago, 3 minutes ago, doesn't matter).
> > > 
> > > Userspace batch to kernel:
> > > Update Element E1 to time out in 2 hours.
> > > Update Element E2 to time out in 1 hour.
> > > Update Element E3 to time out in 1 hour.
> > > 
> > > What is the expected outcome of this request?
> > > 
> > > Ignore E3 being reaped already and refresh the timeout (resurrection?)
> > 
> > No resurrection, the element might have counters, it already expired.
> 
> OK.
> 
> > > Ignore E3 being reaped already and ignore the request?
> > > Fail the transaction as E3 timed out already ("does not exist")?
> > 
> > Add a new E3. If NLM_F_EXCL is specified, then fail with "does not exist"
> 
> OK.

Actually not correct what I said in this case: NLM_F_EXCL means create
in newsetelem() path, then add new E3 always succeeds if there is a
timed out E3, regardless the flag.

No "does not exist" error is possible.

> > > Now, what about E2?  If transaction is large, it could become
> > > like E3 *during the transaction* unless we introduce some freezing
> > > mechanism.  Whats the expected outcome?
> > > 
> > > Whats the expected outcome if there is some other, unrelated
> > > failure? I assume we have to roll back all the timeout updates, right?
> > 
> > We annotate the new timeout in transaction object, then refresh the
> > timeout update in the commit phase.
> 
> OK, so as per "E3-example", you're saying that if E2 expires during
> the transaction, then if F_EXCL is given the transaction will fail while
> otherwise it will be re-added.

Please, revisit if the semantics look correct to you after my
correction on the NLM_F_EXCL flag.

> > > If so, why not temporarily make the timeouts effective right away
> > > and then roll back?
> > 
> > You mean, from the preparation phase? Then we need to undo what has
> > been done, in case of --check / abort path is exercised, this might
> > just create a bogus element listing.
> 
> True.  Am I correct that we can't implement the "expand" via
> del+add because of stateful objects?

I think it is not a good idea to expand a element that has already
expired. There might be another possible corner case:

Refresh element E1 with timeout X -> not expired yet
Element E1 expires
Refresh element E1 with timeout Y -> already expired, ENOENT.

This looks fine to me, this handling is possible because the timeout
is not updated from the preparation phase, only later in the commit
phase.

> I fear we will need to circle back to rbtree then, I'll followup
> there (wrt. on-demand gc).
>
> > No need for rollback if new timeout is store in the transaction
> > object, we just set the new timeout from _commit() step in the
> > NEWSETELEM case, which has to deal with updates. Other objects follow
> > a similar approach.
> 
> Got it.