Andreas B Aaen wrote:
> On Friday 31 October 2008 00:07, Eric W. Biederman wrote:
>> A global nsid breaks migration,
> Yes.
>
>> it breaks nested containers,
> Yes.
>
>> in general it just hurts.
> No.
>
>> So it is a bad choice for an interface.
> Not necessarily. There is a reason why vrf is designed the way it is - and
> the patches that I have worked with had a similar design.
>
>> Personally if I have vrf I want to set up a test environment in a
>> container so I can isolate it from the rest of the system. Allowing me to
>> play with the user space side of the functionality without [...]. So these
>> things are not completely separate concerns.
>
> Ok. Here is my use case.
> I need to talk to 500 IPv4 networks with possibly overlapping IP addresses.
> The packets arrive on 500 VLANs. I want one process to listen to a port on
> each of these networks. I don't want 500 processes that each run in their
> own network namespace and communicate with each other through e.g. unix
> sockets. That just complicates the task.

Why don't you unshare() 500 times in the same process? In each namespace you
create a control socket, and the fd number is the identifier of your
namespace.

>> So from a design point of view I see the following questions.
>> 1) How do we pin a network namespace to allow for routing when no process
>> uses it?
> We introduce a global namespace, or at least a namespace that is unique to
> a process and its sons.
> Maybe a vrf container of network namespaces.
> The vrf container numbers its network namespaces. Each pid points to a vrf
> container. New vrf containers can be made through e.g. unshare(). Migration
> and nesting should be possible.
>
>> 2) How do we create sockets into that pinned network namespace?
> Add a socket option that uses an index (global namespace).
>
>> 3) How do we enter that network namespace so that sockets by default are
>> created in it?
> I don't need this feature.
> The VRF patchset does this, so they can implement a chvrf utility.
>
>> All of these are technically easy things to implement and design-wise a
>> challenge.
> Yes.
>
> As I see it, network namespaces have provided the splitting of all the
> protocols in the network code. This was the huge task. The vrf patches
> that I saw a few years back weren't as mature as this. What's left is
> actually the management of these network namespaces.
>
> Binding network namespaces to processes isn't a good idea for all use
> cases.
>
>> The best solution I see at the moment is to have something (a fs) we can
>> mount in the filesystem, keeping the network namespace alive as long as
>> it is mounted.
>>
>> i.e.
>> mount -t netns none /dev/nets/1
>> mount -t netns -o newinstance none /dev/nets/2
>>
>> (The newinstance parameter creates the network namespace as well as
>> capturing the current one)
>>
>> char netns[] = "/dev/nets/2";
>> fd = socket();
>> err = setsockopt(fd, SOL_SOCKET, SO_NETPATH, netns, strlen(netns) + 1);
>
> So the idea here is to let the userspace side choose the naming and to
> ensure the nesting possibility by using the filesystem.
>
> Would you configure this interface on "/dev/nets/2" like this:
>
> ip addr add 10.0.0.1/24 dev eth1 nets "/dev/nets/2" ?
>
> Where the "/dev/nets/2" parameter is set through an SO_NETPATH option on
> the netlink socket that iproute2 uses in its implementation.
>
> Is this better or worse than a vrf container with numbered network
> namespaces in it?
>
> Regards,

--
Compagnie IBM France
_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/containers