Re: [PATCH iptables 7/7] nft: support for dynamic register allocation

Phil Sutter <phil@xxxxxx> · Thu, 28 Apr 2022 12:07:21 +0200

On Wed, Apr 27, 2022 at 06:23:05PM +0200, Pablo Neira Ayuso wrote:
> On Tue, Apr 26, 2022 at 06:06:41PM +0200, Phil Sutter wrote:
> > On Sun, Apr 24, 2022 at 11:56:13PM +0200, Pablo Neira Ayuso wrote:
> [...]
> > > +
> > > +static int reg_space(int i)
> > > +{
> > > +	return sizeof(uint32_t) * 16 - sizeof(uint32_t) * i;
> > > +}
> > > +
> > > +static void register_track(const struct nft_handle *h,
> > > +			   struct nft_reg_ctx *ctx, int i, int len)
> > > +{
> > > +	if (ctx->reg >= 0 || h->regs[i].word || reg_space(i) < len)
> > > +		return;
> > 
> > Since ctx->reg is not reset in callers' loops and reg_space(i) is
> > monotonic, maybe make those loop exit conditions by returning false from
> > register_track() in those cases?
> 
> you mean:
> 
>         if (!register_track(...))
>                 continue;
> 
> ?
> 
> That defeats the check for a matching register already storing the
> data I need, ie.
> 
>         if (h->regs[i].type != NFT_REG_META)
>                 continue;
>         ...

Oh, indeed! It still holds for the 'reg_space(i) < len' case but it's
a corner case and not worth it.

> > > +	if (h->regs[i].type == NFT_REG_UNSPEC) {
> > > +		ctx->genid = h->reg_genid;
> > 
> > Is ctx->genid used in this case?
> 
> It used to shortcircuit the logic to evict a register (no eviction
> needed case), but that is not needed anymore since ctx->reg >= 0
> already prevents this.
> 
> > > +		ctx->reg = i;
> > > +	} else if (h->regs[i].genid < ctx->genid) {
> > > +		ctx->genid = h->regs[i].genid;
> > > +		ctx->evict = i;
> > 
> > What if the oldest reg is too small?
> 
> The reg_space(i) < len check prevents this?

Ah, I missed the fact that consecutive regs are simply overwritten in
the evict case. What I had in mind was something like this:

reg#	genid	size
--------------------
0	6	4
1	5	16
2	4	4
3	3	4
4	2	4
5	1	4

If that's all the space there is and we want to store 16 bytes, the
algorithm would evict reg2 and all the following while it could just
evict reg1 and preserve the others. I just realize this is the
optimization you're talking about here:

> > > +	} else if (h->regs[i].len == len) {
> > > +		ctx->evict = i;
> > > +		ctx->genid = 0;
> > 
> > Why prefer regs of same size over older ones?
> 
> this was an early optimization. An Ipv6 address might evict up four
> registers, if n stores old data, then n+1, n+2, n+3 store recent data,
> n+1, n+2, n+3 would be unfairly evicted.
> 
> I can remove this case: it is probably an early optimization. This is
> the initial version of the dynamic register allocation infra. It
> should be possible to catch for more suboptimal situations with real
> rulesets, by incrementally reviewing generated bytecode.

Sounds reasonable. I think there are more pitfalls regarding suboptimal
register choice in NFT_REG_UNSPEC case as it doesn't consider available
space either.

Anyway, the size consideration being done only for regs with same genid
as the last one being chosen for eviction seems wrong. Preferring a same
sized reg over older ones does not seem ideal to me, either. What do you
think about this:

| static void register_track(const struct nft_handle *h,
|                            struct nft_reg_ctx *ctx, int i, int len)
| {
|         if (ctx->reg >= 0 || h->regs[i].word || reg_space(i) < len)
|                 return;
| 
|         if (h->regs[i].type == NFT_REG_UNSPEC) {
|                 ctx->reg = i;
|         } else if (ctx->evict < 0 ||
|                    h->regs[ctx->evict].len < len ||
|                    (h->regs[i].len >= len &&
|                     h->regs[i].genid < ctx->genid)) {
|                 ctx->genid = h->regs[i].genid;
|                 ctx->evict = i;
|         }
| }

- If given register is free, use it and be done
- If there's no candidate for eviction, mark it as such
- If eviction candidate size is too small, prefer the current one
  (regardless of size)
- if both eviction candidate and current sizes are sufficient, prefer
  the older one

Cheers, Phil