Re: [PATCH v3 nf-next 2/2] netfilter: nf_tables: support updating userdata for nft_table

Quan Tian <tianquan23@xxxxxxxxx> · Thu, 14 Mar 2024 01:08:55 +0800

On Wed, Mar 13, 2024 at 10:40:05AM +0100, Pablo Neira Ayuso wrote:
> On Wed, Mar 13, 2024 at 09:35:10AM +0800, Quan Tian wrote:
> > On Tue, Mar 12, 2024 at 11:23:56PM +0100, Pablo Neira Ayuso wrote:
> [...]
> > > I don't have a use-case for this table userdata myself, this is
> > > currently only used to store comments by userspace, why someone would
> > > be willing to update such comment associated to a table, I don't know.
> > 
> > There was a use-case in kube-proxy that we wanted to use comment to
> > store version/compatibility information so we could know whether it
> > has to recreate the whole table when upgrading to a new version due to
> > incompatible chain/rule changes (e.g. a chain's hook and priority is
> > changed). The reason why we wanted to avoid recreating the whole table
> > when it doesn't have to is because deleting the table would also
> > destory the dynamic sets in the table, losing some data.
> 
> There is a generation number which gets bumped for each ruleset
> update which is currently global.
> 
> Would having such generation ID per table help or you need more
> flexibility in what needs to be stored in the userdata area?

Auto-increased generation ID may not meet the above requirement as the
table could also be frequently updated by configuration changes at
runtime, not only after the application upgrades. An example scenario
could be: say we have a loadbalancer application based on nftables:

* LB v0.1.0 installs a collection of nftable rules;
* LB v0.1.1 makes some changes compatible with v0.1.0. In upgrade case,
  it doesn't need to recreate the whole table if the existing rules
  were installed by version >= 0.1.0;
* LB v0.2.0 makes some changes incompatible with v0.1.x (e.g. some
  chains' priorities are changed), it needs to recreate the whole table
  when upgrading from v0.1.x to it;
* LB v0.2.1 makes some changes compatible with v0.2.0, it doesn't need
  to recreate the table when upgrading from v0.2.0 to it but needs to
  recreate it when upgrading from v0.1.x to it.

With the support for comment updates, we can store, get, and update
such version information in comment, and use it to determine when it's
necessary to recreate the table.

>> If there is a simple way to detect and reject
>> this then I believe its better to disallow it.
>
> That requires to iterate over the list of transaction, or add some
> kind of flag to reject this.

I tried to detect back-to-back userdata comment updates, but found it's
more complex than I thought. A flag may not work because it can only
tell whether the 1st update touched comment and we don't know whether
the 2nd update is the same as the 1st one, unless we iterate over the
list of transaction, which looks a bit overkill? It seems simpler if we
just allow it and accept the last userdata.

>> My question is, should we instead leave the existing udata as-is and not
>> support removal, only replace?
>
>I would leave it in place too if no _USERDATA is specified.

I got a question when trying to implementing this. If the update request
specifies USERDATA but its length is 0, do we treat it like not
specified, or remove the existing userdata? Asking because I see
nf_tables_newrule() treats 0 length in the same way as unspecified and
doesn't initialize a nft_userdata. It makes sense for create, but I'm
not sure if it's right to do it in the same way for update. 

And if we want to treat it as "remove comment", we can't simply add a
single pointer of nft_userdata to nft_trans_table to achieve it, because
there would be 3 cases: leave it in place, remove it, update it.

Thanks,
Quan