Re: [PATCH nft 0/4] Interval overlap detection for named sets

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 25.04, Pablo Neira Ayuso wrote:
> On Mon, Apr 25, 2016 at 05:59:56PM +0100, Patrick McHardy wrote:
> > On 25.04, Pablo Neira Ayuso wrote:
> > > There is issue, eg:
> > > 
> > > # nft add element ip x myset { 1.1.1.0/24 }
> > > # nft add element ip x myset { 1.1.1.1 }
> > > # nft list ruleset
> > > table ip x {
> > >         set myset {
> > >                 type ipv4_addr
> > >                 flags interval
> > >                 elements = { 1.1.1.0, 1.1.1.1}
> > >         }
> > >         ...
> > > }
> > > 
> > > If we don't reject the above by now, the listing back to userspace is
> > > not consistent. This is not trivial to resolve since the
> > > representation in the kernel (rbtree) are individual nodes. For the
> > > example above we get a representation like:
> > > 
> > > { 0.0.0.0/end, 1.1.1.0, 1.1.1.1, 1.1.1.2/end, 1.1.1.255/end }
> > > 
> > > This is hard to reconstruct from userspace given that intervals are
> > > not easy to identify anymore after this situation.
> > 
> > Yes, the 1.1.1.2/end is also incorrect, which is what messes up the
> > reconstruction.
> 
> It is not only this, if you add this:
> 
> 192.168.0.100-192.168.0.200
> 
> and then later on:
> 
> 192.168.100.0/24
> 
> You get:
> 
> { 0.0.0.0/end, 192.168.0.0, 192.168.0.100, 192.168.0.201/end, 192.168.1.0/end }
> 
> and the interval reconstruction back on the listing becomes inconsistent.
> 
> On top of that, deletion breaks this even further since the user would
> use what nft reconstructs from those overlapping interval (which is
> not what explicit inserted). Then, trying to delete element from what
> we get back via listing makes this even worse as we end up leaving the
> set in more inconsistent state and triggering bogus mismatches.

That's a good point. For consistency reasons, deletions would have to punch
a hole in existing ranges. I guess its a question of how to think of set
elements, do we think of them as containing ranges or every element within a
range? The later appears to more practical and also solve these problems, IOW
if we have 192.168.0.0/24 and delete 192.168.0.255 the resulting set contains
a range .0-.254.

> > For sets its quite easy. If we insert something that is completely overlapped,
> > there's nothing to do anyways. We simply do nothing. If we insert somsething
> > that extends an interval in one of both directions, we adjust it appropriately.
> 
> As I said, this adjustment can be done from userspace by deleting the
> previous interval and by adding the new one in the same batch.
> 
> > We already do exactly this when adding multiple elements to a set in a single
> > step:
> > 
> > # nft --debug=segtree,netlink filter output ip saddr { 192.168.1.0/24, 192.168.1.100 } counter
> 
> Just a side note to be clear: I'm not modifying the behaviour for
> anonymous sets in my patches.

Yeah, this was just for demonstration of what the seqtree does in this case.

> > __set%d filter 7
> > insert: [c0a80100 c0a801ff]
> > insert: [c0a80164 c0a80164]
> > split [c0a80100 c0a801ff]
> > iter: [c0a80100 c0a80163]
> > iter: [c0a80164 c0a80164]
> > iter: [c0a80165 c0a801ff]
> > list: [00000000 c0a800ff]
> > list: [c0a80100 c0a801ff]
> > list: [c0a80200 ffffffff]
> > 
> > __set%d filter 0
> > 	element 00000000  : 1 [end]	element 0001a8c0  : 0 [end]	element 0002a8c0  : 1 [end]
> > 
> > Simply keeping track of the seqtree changes and applying those to the kernel
> > should handle this fine.
> 
> Assuming the shadowed interval is in the kernel, we can delete it and
> add the new one in the same go from the batch.
> 
> Assuming the new interval is shadowed by what we have in the kernel,
> we can just send no update.

Yeah, that's even easier, the tracking might just reduce a few deletions that
we will immediately re-add.

> > > > For maps something like this does make sense, but for sets it only
> > > > makes it harder to use.
> > > > 
> > > > Generally, we have a conflict resolution based on size, the more specific
> > > > element wins. The assumption being that if you add something generic and
> > > > something more specific, the more specific item is an exception to the
> > > > more generic one.
> > > 
> > > Conflicts can be resolved through merges from userspace at some point.
> > > Basically the idea is that, when maps are not in place, the smaller
> > > interval will get removed and the new larger one is added in the same
> > > batch, so this happens in the same go. nft can deal with these
> > > overlaps so this is easier to users.
> > 
> > The conflict resolution policy needs to be discussed. As I explained, we
> > already do this, but use a different policy (try adding a map with
> > two overlapping ranges). This of course so far only applies when the
> > overlapping elements are added in a single step, so using a different
> > policy for successive additions might make sense, however its something
> > I'd like to discuss first.
> >
> > > This requires a bit more work on the caching side, but I would like to
> > > provide a more conservative solution by now by rejecting overlaps.
> > 
> > I fully agree for maps for now, but for sets it seems rather simple to do
> > a proper adjustment. I can give it a try if you want.
> 
> Frankly, I don't see any reason not to push this update.
>
> You can follow up with an incremental update if you want, but I would
> like to see this is working in the next release. The current approach
> is conservative in the sense that it rejects what can take us to
> problems, we can enhance this by making nft smarter in the sense of
> allowing this automatic adjustments.

I agree. Let me have a closer look at the details, but no objections from me.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Netfitler Users]     [LARTC]     [Bugtraq]     [Yosemite Forum]

  Powered by Linux