Re: Moving from ipset to nftables: Sets not ready for prime time yet?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Stefano,

On Fri, 3 Jul 2020, Stefano Brivio wrote:

> On Fri,  3 Jul 2020 00:30:10 +0200 (CEST)
> "Timo Sigurdsson" <public_timo.s@xxxxxxxxxxxxxx> wrote:
> 
> > Another issue I stumbled upon was that auto-merge may actually
> > generate wrong/incomplete intervals if you have multiple 'add
> > element' statements within an nftables script file. I consider this a
> > serious issue if you can't be sure whether the addresses or intervals
> > you add to a set actually end up in the set. I reported this here
> > [2]. The workaround for it is - again - to add all addresses in a
> > single statement.
> 
> Practically speaking I think it's a bug, but I can't find a formal,
> complete definition of automerge, so one can also say it "adds items up
> to and including the first conflicting one", and there you go, it's
> working as intended.
> 
> In general, when we discussed this "automerge" feature for
> multi-dimensional sets in nftables (not your case, but I aimed at
> consistency), I thought it was a mistake to introduce it altogether,
> because it's hard to define it and whatever definition one comes up
> with might not match what some users think. Consider this example:
> 
> # ipset create s hash:net,net
> # ipset add s 10.0.1.1/30,192.168.1.1/24
> # ipset add s 10.0.0.1/24,172.16.0.1
> # ipset list s
> [...]
> Members:
> 10.0.1.0/30,192.168.1.0/24
> 10.0.0.0/24,172.16.0.1
> 
> good, ipset has no notion of automerge, so it won't try to do anything
> bad here: the set of address pairs denoted by <10.0.1.1/30,
> 192.168.1.1/24> is disjoint from the set of address pairs denoted by
> <10.0.0.1/24, 172.16.0.1>. Then:
> 
> # ipset add s 10.0.0.1/16,192.168.0.0/16
> # ipset list s
> [...]
> Members:
> 10.0.1.0/30,192.168.1.0/24
> 10.0.0.0/16,192.168.0.0/16
> 10.0.0.0/24,172.16.0.1
> 
> and, as expected with ipset, we have entirely overlapping entries added
> to the set. Is that a problem? Not really, ipset doesn't support maps,
> so it doesn't matter which entry is actually matched.

Actually, the flags, extensions (nomatch, timeout, skbinfo, etc.) in ipset 
are some kind of mappings and do matter which entry is matched and which 
flags, extensions are applied to the matching packets.

Therefore the matching in the net kind of sets follow a strict ordering: 
most specific match wins and in the case of multiple dimensions (like 
net,net above) it goes from left to right to find the best most specific 
match.

> # nft add table t
> # nft add set t s '{ type ipv4_addr . ipv4_addr; flags interval ; }'
> # nft add element t s '{ 10.0.1.1/30 . 192.168.1.1/24 }'
> # nft add element t s '{ 10.0.0.1/24 . 172.16.0.1 }'
> # nft add element t s '{ 10.0.0.1/16 . 192.168.0.0/16 }'
> # nft list ruleset
> table ip t {
> 	set s {
> 		type ipv4_addr . ipv4_addr
> 		flags interval
> 		elements = { 10.0.1.0/30 . 192.168.1.0/24,
> 			     10.0.0.0/24 . 172.16.0.1,
> 			     10.0.0.0/16 . 192.168.0.0/16 }
> 	}
> }
> 
> also fine: the least generic entry is added first, so it matches first.
> Let's try to reorder the insertions:
> 
> # nft add element t s '{ 10.0.0.1/16 . 192.168.0.0/16 }'
> # nft add element t s '{ 10.0.0.1/24 . 172.16.0.1 }'
> # nft add element t s '{ 10.0.1.1/30 . 192.168.1.1/24 }'
> Error: Could not process rule: File exists
> add element t s { 10.0.1.1/30 . 192.168.1.1/24 }
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> ...because that entry would never match anything: it's inserted after a
> more generic one that already covers it completely, and we'd like to
> tell the user that it doesn't make sense.

I think sets should not store information about which order the entries 
were added. That should totally be indifferent. The input of the sets may 
come from countless sources and if the order of adding the entries matters 
then a preordering is required, which is sometimes non-trivial.
 
> Now, this is pretty much the only advantage of not allowing overlaps:
> telling the user that some insertion doesn't make sense, and thus it
> was probably not what the user wanted to do.

This makes also impossible to make exceptions in the sets in nftables - 
with the "nomatch" flag in ipset one can easily create exceptions in 
intentionally overlapping entries (in whatever deep nestings) in a single 
set. In practice it comes quite handy to say

ipset create access_to_servers hash:ip,port,net
ipset add access_to_servers your_ssh_server,22,x.y.z.0/24
ipset add access_to_servers your_ssh_server,22,x.y.z.32/27 nomatch
...

and exclude access to some parts of a given subnet.

However, the internals of the sets in nftables are totally different from 
ipset, so I'm pretty sure it's absolutely not trivial (and sometimes 
impossible) to provide exactly the same behaviour.

> So... I wouldn't know how deal with your use case, even in theory, in a
> consistent way. Should we rather introduce a flag that allows any type
> of overlapping (default with ipset), which is a way for the user to
> tell us they don't actually care about entries not having any effect?
> 
> And, in that case, would you expect the entry to be listed in the
> resulting set, in case of full overlap (where one set is a subset, not
> necessarily proper, of the other one)?
> 
> > [...]
> >
> > Summing up:
> > Well, that's quite a number of issues to run into as an nftables
> > newbie. I wouldn't have expected this at all. And frankly, I actually
> > converted my rules first and thought adjusting my scripts around
> > ipset to achieve the same with nftables sets would be straightforward
> > and simple... Maybe my approach or understanding of nftables is
> > wrong. But I don't think that the use case is that extraordinary that
> > it should be that difficult.
> 
> I don't think so either, still I kind of expect to see the issues you
> report as these features seem to start being heavily used just recently.
> 
> And I (maybe optimistically) think that all we need to iron out the
> most apparent issues on the subject is a few reports like yours, so
> thanks for sharing it.

Best regards,
Jozsef
-
E-mail  : kadlec@xxxxxxxxxxxxxxxxx, kadlecsik.jozsef@xxxxxxxxx
PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics
          H-1525 Budapest 114, POB. 49, Hungary



[Index of Archives]     [Netfitler Users]     [Berkeley Packet Filter]     [LARTC]     [Bugtraq]     [Yosemite Forum]

  Powered by Linux