jamal wrote:
> On Thu, 2010-01-14 at 16:37 +0100, Patrick McHardy wrote:
>> jamal wrote:
>
>>> Agreed that this would be a main driver of such a feature.
>>> Which means that you need zones (or whatever noun other people use) to
>>> work on not just netfilter, but also routing, ipsec etc.
>> Routing already works fine. I believe IPsec should also work already,
>> but I haven't tried it.
>
> maybe further discussion would clarify this point..
>
>> The zone is set based on some other criteria (in this case the
>> incoming device).
>
> If you are using a netdev as a reference point, then I take it
> if you add vlans should be possible to do multiple zones on a single
> physical netdev? Or is there some other way to satisfy that?

Yes, you can assign a zone to each netdev. macvlan will also work.
On second thought, though, using a netfilter target in the raw table
might be a better choice: it provides more flexibility and avoids
the netfilter-specific device setting. I'll probably change that.

>> The packets make one pass through the stack
>> to a veth device and are SNATed in POSTROUTING to non-clashing
>> addresses.
>
> Ok - makes sense.
> i.e NAT would work; and policy routing as well as arp would be fine.
> Also it looks to be sufficiently useful to fit a specific use case you
> are interested in.
> But back to my question on routing, ipsec etc (and you may not be
> interested in solving this problem, but it is what i was getting to
> earlier). Lets take for example:
> a) network tables like SAD/SPD tables: how you would separate those on a
> per-zone basis? i.e 10.0.0.1/zone1 could use different
> policy/association than 10.0.0.1/zone2

The selectors include an ifindex, which could be used to distinguish
the two based on the interface.

> b) dynamic protocols (routing, IKE etc): how do you do that without
> making both sides understand what is going on?

In the case of IPsec the outer addresses are different; it's only the
selectors which will have similar addresses.
A keying daemon should have no trouble with this. The ifindex would be
needed in the selectors, though, to make sure each policy is used for
the correct traffic.

Using a routing daemon is unrealistic in this scenario, at least
a single one for all the overlapping networks.

>>> This is a valid concern against the namespace approach. Existing tools
>>> of course could be taught to know about namespaces - and one could
>>> argue that if you can resolve the overlap IP address issue, then you
>>> _have to_ modify user space anyways.
>> I don't think thats true.
>
> Refer to my statements above for an example.
>
>> In any case its completely impractical
>> to modify every userspace tool that does something with networking
>> and potentially make complex configuration changes to have all
>> those namespaces interact nicely.
>
> Agreed. But the major ones like iproute2 etc could be taught. We have
> namespaces in the kernel already, over a period of time I think changing
> the user space tools would a sensible evolution.

Yes, that might be useful in any case. But I don't think it would
even work for iproute or other standalone programs: a process can't
associate to an existing namespace except through clone(). So it
needs to run as a child of a process already associated with the
namespace.

>> Currently they are simply not
>> very well suited for virtualizing selected parts of networking.
>
> My contention is that it is a lot less headache to just virtualize
> all the network stack and then use what you want than it is to go and
> selectively changing the network objects.
> Note: if i wanted today i could run racoon on every namespace
> unchanged and it would work or i could modify racoon to understand
> namespaces...

See above.

>> I'm not sure whether there is a typical user for overlapping
>> networks :) I know of setups with ~150 overlapping networks.
>>
>> The number of conntracks per zone doesn't matter since the
>> table is shared between all zones.
>> network namespaces would allocate 150 tables, each of the same
>> size, which might be quite large.
>
> Thats what i was looking for ..
> So the difference, to pick the 150 zones example so as to put a number
> around it, is namespaces will consume 150.X bytes (where X is the
> overhead of a conntrack table) and you approach will be (X + 152) bytes,
> correct?
> What is the typical sizeof X?

No. To give some correct numbers: assuming a conntrack table of 10MB
(large, but reasonable depending on the number of connections), we
get an overhead of:

namespaces: 150 * 10MB memory use
"zones":    152 bytes increased code size

Both approaches additionally need one extra connection tracking entry
of ~300 bytes for each connection that is actually handled twice.

>>> You may also wanna look as a metric at code complexity/maintainability
>>> of this scheme vs namespace (which adds zero changes to the kernel).
>> There's not a lot of complexity, its basically passing a numeric
>> identifier around in a few spots and comparing it. Something like
>> TOS handling in the routing code.
>
> I think the challenge is whether zones will have to encroach on other
> net stack objects or not. You are already touching structure netdev...

That will go away once I add a target for classification. I completely
agree that it's undesirable to add this in more spots, but this is
meant purely for being able to pass traffic through conntrack/NAT
more than once.
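
For illustration only, here is a minimal Python sketch of the two points
discussed above: a single shared table in which the zone is just one more
numeric field in the lookup key, and the overhead arithmetic for the
150-network example. Everything in it is an assumption made for the sake
of the example: the names (ZonedConntrack, Tuple5, namespace_overhead and
so on) are hypothetical and do not correspond to kernel structures or to
the actual patch, and "MB" is taken to mean 2**20 bytes.

```python
from typing import Dict, NamedTuple, Tuple

MB = 1024 * 1024  # assumption: "10MB" is read as 10 * 2**20 bytes


class Tuple5(NamedTuple):
    """Hypothetical stand-in for a connection tuple."""
    src: str
    dst: str
    proto: str
    sport: int
    dport: int


class ZonedConntrack:
    """One table shared by all zones; the zone id is part of the key."""

    def __init__(self) -> None:
        self.table: Dict[Tuple[int, Tuple5], dict] = {}

    def lookup_or_create(self, zone: int, tup: Tuple5) -> dict:
        # Identical tuples in different zones compare unequal because
        # the numeric zone id is compared along with the tuple.
        key = (zone, tup)
        return self.table.setdefault(key, {"zone": zone, "packets": 0})


def namespace_overhead(networks: int, table_bytes: int) -> int:
    """Each namespace allocates its own table of the same size."""
    return networks * table_bytes


def zone_overhead(code_bytes: int = 152) -> int:
    """Zones share one table; the cost quoted above is code size only."""
    return code_bytes


def double_pass_overhead(connections: int, entry_bytes: int = 300) -> int:
    """Extra entries for connections handled twice (either approach)."""
    return connections * entry_bytes


if __name__ == "__main__":
    ct = ZonedConntrack()
    t = Tuple5("10.0.0.1", "10.0.0.2", "tcp", 12345, 80)
    a = ct.lookup_or_create(1, t)  # 10.0.0.1 in zone 1
    b = ct.lookup_or_create(2, t)  # same addresses, zone 2
    print("distinct entries:", a is not b)
    print("namespaces:", namespace_overhead(150, 10 * MB) // MB, "MB")
    print("zones:", zone_overhead(), "bytes")
```

With the figures from the mail, the namespace case comes to 150 tables of
10MB each, while the zone case adds only the 152 bytes of code; the ~300
bytes per doubly-handled connection applies to both.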