Re: [PATCH 1/2 nf] netfilter: nft_set_bitmap: keep a list of dummy elements

Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> · Tue, 14 Mar 2017 16:21:53 +0100

On Tue, Mar 14, 2017 at 10:44:43PM +0800, Liping Zhang wrote:
> Hi Pablo,
> 2017-03-14 20:19 GMT+08:00 Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>:
> [...]
> > Another possibility is to simply regard desc->size over the memory
> > scalability notation when provided. I think this just needs an update
> > from nft userspace. Look, bitmap and hashtable are both described as
> > O(1) in terms of performance. If the user provides the set size (this
> > is known in anonymous sets) we can select the one that takes less
> > memory. When no size is specified, we rely on the set policy that is
> > specified.
> >
> > Still, for anonymous sets we will select hashtable instead, this is
> > going to be slower in systems that have plenty of memory. I think we
> > cannot escape the new per-table global knob to select
> > memory/performance for anonymous sets after all.
> 
> After we implement more and more sets types, I think just based on
> POL_PERFORMANCE or POL_MEMORY to select a suitable set will
> become a more and more difficult task. So how about this method:
> 1. For compatibility, POL_PERFORMANCE means hash set, and
>     POL_MEMORY means rbtree set. (I know this may be incorrect when
>     the set->size is 0)
> 2. When the user creates the set, they can specify a new settype to
>     select the set type, such as hash, rbtree, bitmap... a little similar to
>     ipset.
> 
> I know this method is not perfect, but this will provide big
> flexibility to the user.

Then we cannot deprecate set types like the rbtree, which I would very
much like to find a replacement for: once it is exposed to userspace,
anyone could be using it, and we cannot break existing user setups.

Moreover, if we, the developers, don't know exactly what a good
choice is, how can users know what is best for them? I would prefer
developers come to us to tune the set backend selection so we get it
right. We can enhance this model incrementally.

Leaking details to userspace is easy, it is just a matter of exposing
all these knobs via netlink. If this turns out to be the way, we'll do
it at some point, but I'm still willing to spend time on this set
backend selection routine.

> > I'm curious, what kind of device are you thinking of with such memory
> > restrictions that cannot take 320 kB? I would expect such embedded
> > device that cannot afford such memory consumption will come also with
> > a smallish cpu.
> 
> We had a small router with 32MB of memory at my previous company.
> On such an embedded device, occupying 320KB is of course no problem
> either.
>
> But I guess the user will not be happy to learn that inputting an
> nft rule such as "nft add rule x y tcp dport {21, 22} drop" will
> consume more than 16KB of memory :)

For such a small use case, we can expose something like:

table x {
        policy memory; <----

        chain y {
                type filter hook output priority 0; policy drop;

                tcp dport {22, 80} ct state established,related accept
        }
}

So the kernel knows memory is more important than performance, and
this policy exposes what the user needs. If not specified, the
performance representation is selected.
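
To make the idea concrete, here is a rough sketch of how such a
selection could behave for a set whose size is known, as it is for
anonymous sets. It is not the kernel code: the names (estimates,
select_backend) and the per-element costs for hash and rbtree are made
up for illustration. The bitmap cost assumes two bits per possible key
value and keys of at most 16 bits, which matches the 16KB figure
mentioned above.

```python
from dataclasses import dataclass
from enum import IntEnum


class Lookup(IntEnum):
    """Lookup cost class; a lower value means a faster lookup."""
    O_1 = 0
    O_LOG_N = 1


@dataclass(frozen=True)
class Estimate:
    backend: str
    lookup: Lookup
    mem_bytes: int


# Hypothetical per-element costs, for illustration only.
HASH_ELEM_BYTES = 64
RBTREE_ELEM_BYTES = 48


def estimates(key_bits, n_elems):
    """Return one estimate per backend that can hold this set."""
    out = [
        Estimate("hash", Lookup.O_1, n_elems * HASH_ELEM_BYTES),
        Estimate("rbtree", Lookup.O_LOG_N, n_elems * RBTREE_ELEM_BYTES),
    ]
    if key_bits <= 16:
        # Assumed: two bits per possible key value, whatever n_elems is.
        out.append(Estimate("bitmap", Lookup.O_1, (1 << key_bits) * 2 // 8))
    return out


def select_backend(policy, key_bits, n_elems):
    """Pick a backend for a set of known size under the given policy."""
    if policy == "memory":
        # Smallest footprint first, lookup class breaks ties.
        key = lambda e: (e.mem_bytes, e.lookup)
    elif policy == "performance":
        # Best lookup class first, footprint breaks ties.
        key = lambda e: (e.lookup, e.mem_bytes)
    else:
        raise ValueError(f"unknown policy: {policy!r}")
    return min(estimates(key_bits, n_elems), key=key).backend
```

With these assumed costs, the two-element tcp dport set above ends up
on the rbtree under the memory policy and on the hashtable under the
performance policy. The bitmap only wins once the set is large enough
that its fixed 16KB is smaller than the per-element backends.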