Re: [PATCH nf-next v2 3/8] nf_tables: Add set type for arbitrary concatenation of ranges

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 27 Nov 2019 10:29:45 +0100
Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:

> Hi Stefano,
> 
> Just started reading, a few initial questions.
> 
> On Fri, Nov 22, 2019 at 02:40:02PM +0100, Stefano Brivio wrote:
> [...]
>
> > +static int nft_pipapo_insert(const struct net *net, const struct nft_set *set,
> > +			     const struct nft_set_elem *elem,
> > +			     struct nft_set_ext **ext2)
> > +{
> > +	const struct nft_set_ext *ext = nft_set_elem_ext(set, elem->priv);
> > +	const u8 *data = (const u8 *)elem->key.val.data, *start, *end;
> > +	union nft_pipapo_map_bucket rulemap[NFT_PIPAPO_MAX_FIELDS];
> > +	struct nft_pipapo *priv = nft_set_priv(set);
> > +	struct nft_pipapo_match *m = priv->clone;
> > +	struct nft_pipapo_elem *e = elem->priv;
> > +	struct nft_pipapo_field *f;
> > +	int i, bsize_max, err = 0;
> > +	void *dup;
> > +
> > +	dup = nft_pipapo_get(net, set, elem, 0);
> > +	if (PTR_ERR(dup) != -ENOENT) {
> > +		priv->start_elem = NULL;
> > +		if (IS_ERR(dup))
> > +			return PTR_ERR(dup);
> > +		*ext2 = dup;  
> 
> dup should be of nft_set_ext type. I just had a look at
> nft_pipapo_get() and I think this returns nft_pipapo_elem, which is
> almost good, since it contains nft_set_ext, right?

Right, it returns nft_pipapo_elem which contains that.

> I think you also need to check if the object is active in the next
> generation via nft_genmask_next() and nft_set_elem_active(), otherwise
> ignore it.

I guess I should actually do this in nft_pipapo_get(), also because we
don't want to return inactive elements when userspace "gets" them.

I just noticed this is currently inconsistent with the lookup, because
nft_pipapo_lookup() correctly does:

--
next_match:
		b = pipapo_refill(res_map, f->bsize, f->rules, fill_map, f->mt,
				  last);
		if (b < 0) {
			raw_cpu_write(nft_pipapo_scratch_index, map_index);
			local_bh_enable();

			return false;
		}

		if (last) {
			*ext = &f->mt[b].e->ext;
			if (unlikely(nft_set_elem_expired(*ext) ||
				     !nft_set_elem_active(*ext, genmask)))
				goto next_match;
--

but I forgot to implement the same check in pipapo_get():

--
next_match:
		b = pipapo_refill(res_map, f->bsize, f->rules, fill_map, f->mt,
				  last);
		if (b < 0)
			goto out;

		if (last) {
			if (nft_set_elem_expired(&f->mt[b].e->ext))
				goto next_match;
--

this check should simply include || !nft_set_elem_active(...), and then
I wouldn't need any further check in nft_pipapo_init(). I'd fix this in
v3.

I'm actually not sure if I need to report these elements to
nft_pipapo_remove(). If it's needed, I would add some kind of
"get_inactive" flag to pipapo_get(), which is true on the call from
nft_pipapo_remove(), and false on other paths. If the flag is true, the
nft_set_elem_active() check is then skipped.

> Note that the datastructure needs to temporarily deal with duplicates,
> ie. one inactive object (just deleted) and one active object (just
> added) for the next generation.

Yes, this is taken care of (except for the problem described above),
specifically, there can be n inactive objects, and a single active
object that are entirely overlapping.

This makes some optimisations harder to implement, namely, step 5.2.1
from:
	https://pipapo.lameexcu.se/pipapo/tree/pipapo.c#n337

because we need to allow entirely overlapping entries and map them to
possibly distinct elements.

Now, I think this would all be easier if the API implemented
transactions and commit in a way that appears more natural to me.

When I started working on this, I initially thought activate() would be
called once per transaction, not per element, so that insert() and
remove() would add or remove elements pending for a given transaction,
and activate() would commit it. Same for flush().

At that point, we would have a copy of lookup data with pending
insertions and without pending deletions, and on transaction commit,
this copy would become active, with no inactive elements into it.
Hence, no overlapping elements in live data.

This way we could also make transactions atomic. If activate() is
called once for each element in the transaction, that can't be atomic.

I plan to work on this (if it makes sense), but it looks rather
complicated to match this with existing set implementations and
especially current UAPI, that's the main reason why I "worked around"
this aspect for the moment being. I guess that having at least one set
implementation that can play along with this model would help later.

> > +		return -EEXIST;
> > +	}
> > +
> > +	if (!nft_set_ext_exists(ext, NFT_SET_EXT_FLAGS) ||
> > +	    !(*nft_set_ext_flags(ext) & NFT_SET_ELEM_INTERVAL_END)) {
> > +		priv->start_elem = e;
> > +		*ext2 = &e->ext;
> > +		memcpy(priv->start_data, data, priv->width);
> > +		return 0;
> > +	}
> > +
> > +	if (!priv->start_elem)
> > +		return -EINVAL;  
> 
> I'm working on a sketch patch to extend the front-end code to make
> this easier for you, will post it asap, so you don't need this special
> handling to collect both ends of the interval.

Nice, thanks. Mind that I think this is actually a bit ugly but fine.
As I was mentioning to Florian, it doesn't present any particular race
with bad consequences (at least in v2).

Right now I was trying to get the NFTA_SET_DESC_CONCAT >
NFTA_LIST_ELEM > NFTA_SET_FIELD_LEN nesting implemented in libnftnl in
a somewhat acceptable way. Let me know if the front-end changes would
affect this significantly, I'll wait for your patch in that case.

> So far, just spend a bit of time on this, will get back to you with
> more feedback.
> 
> Thanks for working on this!

And thanks for reviewing it!

-- 
Stefano




[Index of Archives]     [Netfitler Users]     [Berkeley Packet Filter]     [LARTC]     [Bugtraq]     [Yosemite Forum]

  Powered by Linux