Hi,

On Wed, Nov 20, 2019 at 12:49:54PM +0100, Stefano Brivio wrote:
> On Tue, 19 Nov 2019 23:12:38 +0100
> Phil Sutter <phil@xxxxxx> wrote:
> > On Tue, Nov 19, 2019 at 02:07:11AM +0100, Stefano Brivio wrote:
> > [...]
> > > diff --git a/src/netlink.c b/src/netlink.c
> > > index 7306e358..b8bfd199 100644
> > > --- a/src/netlink.c
> > > +++ b/src/netlink.c
> > [...]
> > > @@ -199,10 +210,39 @@ static void netlink_gen_concat_data(const struct expr *expr,
> > >  	memset(data, 0, sizeof(data));
> > >  	offset = 0;
> > >  	list_for_each_entry(i, &expr->expressions, list) {
> > > -		assert(i->etype == EXPR_VALUE);
> > > -		mpz_export_data(data + offset, i->value, i->byteorder,
> > > -				div_round_up(i->len, BITS_PER_BYTE));
> > > -		offset += netlink_padded_len(i->len) / BITS_PER_BYTE;
> > > +		if (i->etype == EXPR_RANGE) {
> > > +			const struct expr *e;
> > > +
> > > +			if (is_range_end)
> > > +				e = i->right;
> > > +			else
> > > +				e = i->left;
> > > +
> > > +			offset += netlink_export_pad(data + offset,
> > > +						     e->value, e);
> > > +		} else if (i->etype == EXPR_PREFIX) {
> > > +			if (is_range_end) {
> > > +				mpz_t v;
> > > +
> > > +				mpz_init_bitmask(v, i->len -
> > > +						    i->prefix_len);
> > > +				mpz_add(v, i->prefix->value, v);
> > > +				offset += netlink_export_pad(data +
> > > +							     offset,
> > > +							     v, i);
> >
> > Given the right-alignment, maybe introduce __netlink_gen_concat_data()
> > to contain the loop body?
>
> While at it, I would also drop the if (1) that makes this function not
> so pretty. It was introduced by 53fc2c7a7998 ("netlink: move data
> related functions to netlink.c") where data was turned from an allocated
> buffer into a VLA, but I don't see a reason why we can't do:
>
> 	unsigned int len = expr->len / BITS_PER_BYTE, offset = 0;
> 	unsigned char data[len];

ACK!
> So, the whole thing would look like (together with the other change
> you suggest):
>
> --
> static int netlink_gen_concat_data_expr(const struct expr *i,
> 					unsigned char *data)
> {
> 	if (i->etype == EXPR_RANGE) {
> 		const struct expr *e;
>
> 		if (i->flags & EXPR_F_INTERVAL_END)
> 			e = i->right;
> 		else
> 			e = i->left;
>
> 		return netlink_export_pad(data, e->value, e);
> 	}
>
> 	if (i->etype == EXPR_PREFIX) {
> 		if (i->flags & EXPR_F_INTERVAL_END) {
> 			int count;
> 			mpz_t v;
>
> 			mpz_init_bitmask(v, i->len - i->prefix_len);
> 			mpz_add(v, i->prefix->value, v);
> 			count = netlink_export_pad(data, v, i);
> 			mpz_clear(v);
> 			return count;
> 		}
>
> 		return netlink_export_pad(data, i->prefix->value, i);
> 	}
>
> 	assert(i->etype == EXPR_VALUE);
>
> 	return netlink_export_pad(data, i->value, i);
> }

I would even:

| static int
| netlink_gen_concat_data_expr(const struct expr *i, unsigned char *data)
| {
| 	mpz_t *valp = NULL;
|
| 	switch (i->etype) {
| 	case EXPR_RANGE:
| 		i = (i->flags & EXPR_F_INTERVAL_END) ? i->right : i->left;
| 		break;
| 	case EXPR_PREFIX:
| 		if (i->flags & EXPR_F_INTERVAL_END) {
| 			int count;
| 			mpz_t v;
|
| 			mpz_init_bitmask(v, i->len - i->prefix_len);
| 			mpz_add(v, i->prefix->value, v);
| 			count = netlink_export_pad(data, v, i);
| 			mpz_clear(v);
| 			return count;
| 		}
| 		valp = &i->prefix->value;
| 		break;
| 	case EXPR_VALUE:
| 		break;
| 	default:
| 		BUG("invalid expression type '%s' in set", expr_ops(i)->name);
| 	}
|
| 	return netlink_export_pad(data, valp ? *valp : i->value, i);
| }

But that's up to you. :)

> static void netlink_gen_concat_data(const struct expr *expr,
> 				    struct nft_data_linearize *nld)
> {
> 	unsigned int len = expr->len / BITS_PER_BYTE, offset = 0;
> 	unsigned char data[len];
> 	const struct expr *i;
>
> 	memset(data, 0, len);
>
> 	list_for_each_entry(i, &expr->expressions, list)
> 		offset += netlink_gen_concat_data_expr(i, data + offset);
>
> 	memcpy(nld->value, data, len);
> 	nld->len = len;
> }
> --
>
> Is that better?

Looks great, thanks!

[...]
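As an aside, the EXPR_PREFIX branch above computes the upper bound of a prefix by adding a bitmask of the host bits to the prefix base (mpz_init_bitmask() builds 2^n - 1). A sketch of the same computation on a plain 32-bit word, with a helper name of my own choosing, not taken from the patch:

```c
#include <stdint.h>

/*
 * Illustrative only: upper bound of a prefix of width 'plen' inside a
 * 'len'-bit key, i.e. base + (2^(len - plen) - 1). Mirrors the
 * mpz_init_bitmask() + mpz_add() steps on a machine word. Because the
 * host bits of a valid prefix base are zero, the addition is the same
 * as a bitwise OR.
 */
static uint32_t prefix_range_end(uint32_t base, unsigned int len,
				 unsigned int plen)
{
	unsigned int host_bits = len - plen;

	/* Shifting a 32-bit value by 32 or more is undefined in C. */
	if (host_bits >= 32)
		return UINT32_MAX;

	return base | (((uint32_t)1 << host_bits) - 1);
}
```

For 10.0.0.0/8 this yields 10.255.255.255, matching what the mpz-based code produces for the range end of a prefix element.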
> > So this is the fifth copy of the same piece of code. :(
> >
> > Isn't there a better way to solve that?
>
> I couldn't think of any.

I guess we would need an intermediate state which is 'multiton_rhs_expr
DOT multiton_rhs_expr'. Might turn into a mess as well. :)

> > If not, we could at least introduce a function
> > compound_expr_alloc_or_add().
>
> Right, added. The only ugly aspect is that we also need to update the
> location of $3 here, and I don't think we should export
> location_update() from parser_bison.y, so this function needs to be in
> parser_bison.y itself, I guess.

Yes, that's fine with me. It's just my c'n'p aversion which got worse
when spending time with iptables code.

[...]

> > > -	if (set->flags & NFT_SET_INTERVAL &&
> > > +	if (set->flags & NFT_SET_INTERVAL && !(set->flags & NFT_SET_SUBKEY) &&
> >
> > Maybe introduce set_is_non_concat_range() or something? Maybe even a
> > macro, it's just about:
> >
> > | return (set->flags & (NFT_SET_INTERVAL | NFT_SET_SUBKEY)) == NFT_SET_INTERVAL;
>
> I'm adding that as static inline together with the other set_is_*()
> functions in rule.h, it looks consistent.

ACK!

[...]

> > I don't understand this algorithm.
>
> It basically does the same thing as the function you wrote below, but
> instead of shifting 'start' and 'end', 'step' is increased, and set
> onto them, so that at every iteration we have the resulting mask
> available in 'base'.
>
> That's not actually needed here, though. It's a left-over from a
> previous idea of composing netmasks in nftables rather than in the
> set. I'll switch to the "obvious" implementation.
>
> > Intuitively, I would just:
> >
> > | int tmp = len;
> > | while (start != end && !(start & 1) && (end & 1) && tmp) {
> > | 	start >>= 1;
> > | 	end >>= 1;
> > | 	tmp--;
> > | }
> > | return (tmp && start == end) ? len - tmp : -1;
> >
> > Is that slow when dealing with gmp?
> I don't think so, it also avoids copies and allocations, while shifting
> and setting bits have comparable complexities. I'd go with the gmp
> version of this.

OK, thanks for explaining!

Cheers, Phil
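For reference, the shift-based check discussed above can be written out in plain C on machine words. This is only a sketch of the idea, not the gmp code that went into the patch, and the helper name and return convention (prefix length on success, -1 otherwise) are mine: strip trailing 0-bits from start and trailing 1-bits from end in lockstep; the range is a single prefix exactly when the remainders match.

```c
#include <stdint.h>

/*
 * Sketch only: decide whether [start, end] covers exactly one prefix
 * within a 'len'-bit key. Each stripped bit pair (0 from start, 1 from
 * end) is a host bit; if the remaining high bits are identical, they
 * form the prefix. Returns the prefix length, or -1 if the range is
 * not expressible as a single prefix.
 */
static int range_to_prefix(uint32_t start, uint32_t end, int len)
{
	int plen = len;

	while (plen > 0 && !(start & 1) && (end & 1)) {
		start >>= 1;
		end >>= 1;
		plen--;
	}

	return start == end ? plen : -1;
}
```

For example, 0x0A000000-0x0AFFFFFF (10.0.0.0-10.255.255.255) collapses to a /8, while 5-6 straddles a prefix boundary and is rejected. With gmp the same loop would use mpz_tstbit() and mpz_fdiv_q_2exp() instead of `&` and `>>`.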