Re: [PATCH nf-next 3/8] nf_tables: Add set type for arbitrary concatenation of ranges

Stefano Brivio <sbrivio@xxxxxxxxxx> wrote:
> +static bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
> +			      const u32 *key, const struct nft_set_ext **ext)
> +{
> +	struct nft_pipapo *priv = nft_set_priv(set);
> +	unsigned long *res_map, *fill_map;
> +	u8 genmask = nft_genmask_cur(net);
> +	const u8 *rp = (const u8 *)key;
> +	struct nft_pipapo_match *m;
> +	struct nft_pipapo_field *f;
> +	bool map_index;
> +	int i;
> +
> +	map_index = raw_cpu_read(nft_pipapo_scratch_index);

I'm afraid this will need local_bh_disable to prevent reentry from
softirq processing.
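
Something along these lines, perhaps -- an untested sketch, and the
right spot for local_bh_enable() depends on how long the scratch maps
stay live:

```c
	/* Keep softirq-context lookups from re-entering and
	 * clobbering the same per-cpu scratch maps while in use.
	 */
	local_bh_disable();

	map_index = raw_cpu_read(nft_pipapo_scratch_index);
	/* ... use res_map / fill_map ... */

	local_bh_enable();
```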

> +	rcu_read_lock();

All netfilter hooks run inside rcu read section, so this isn't needed.

> +static int pipapo_realloc_scratch(struct nft_pipapo_match *m,
> +				  unsigned long bsize_max)
> +{
> +	int i;
> +
> +	for_each_possible_cpu(i) {
> +		unsigned long *scratch;
> +
> +		scratch = kzalloc_node(bsize_max * sizeof(*scratch) * 2,
> +				       GFP_KERNEL, cpu_to_node(i));
> +		if (!scratch)
> +			return -ENOMEM;

No need to handle partial failures on the other cpus / no rollback?
AFAICS ->destroy will handle it correctly, i.e. the next insertion may
enter this again and allocate a same-sized chunk, so AFAICS it's fine.

But still, it looks odd -- perhaps add a comment that there is no need
to rollback earlier allocs.
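
E.g. something like this (exact wording up to you):

```c
		/* No rollback of earlier per-cpu allocations on
		 * failure: ->destroy frees whatever was allocated so
		 * far, and the next insertion attempt will re-enter
		 * this function with the same bsize_max.
		 */
		if (!scratch)
			return -ENOMEM;
```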

> +
> +		kfree(*per_cpu_ptr(m->scratch, i));

I was about to ask what would prevent nft_pipapo_lookup() from accessing
m->scratch.  It's because "m" is the private clone.  Perhaps add a
comment here to that effect.
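
Perhaps along these lines:

```c
		/* Safe vs. concurrent nft_pipapo_lookup(): "m" is the
		 * private clone, lookups only ever see the current
		 * (public) copy.
		 */
		kfree(*per_cpu_ptr(m->scratch, i));
```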

> + * @net:	Network namespace
> + * @set:	nftables API set representation
> + * @elem:	nftables API element representation containing key data
> + * @flags:	If NFT_SET_ELEM_INTERVAL_END is passed, this is the end element
> + * @ext2:	Filled with pointer to &struct nft_set_ext in inserted element
> + *
> + * In this set implementation, this function needs to be called twice, with
> + * start and end element, to obtain a valid entry insertion. Calls to this
> + * function are serialised, so we can store element and key data on the first
> + * call with start element, and use it on the second call once we get the end
> + * element too.

What guarantees this?
AFAICS userspace could send a single element, with either
NFT_SET_ELEM_INTERVAL_END set, or only the start element.
