Hello David,

You may try the patch below (kernel v2.6.35) and see if that helps. It
basically does what you asked for: during namespace cleanup, move back
the virtual interfaces to their original namespaces. I did some tests
with veth pairs and nested netns's and everything worked fine.

I think this should be the default behaviour, and I would like it if
someone could review/fix this patch and push it upstream.

Have a good day,
Renato.

2011/2/26 Daniel Lezcano <daniel.lezcano@xxxxxxx>

> On 02/26/2011 05:59 PM, Ward, David - 0663 - MITLL wrote:
> > (Apologies for the cross-post, but Thunderbird messed up the formatting
> > when I sent this originally, and then I realized I sent it to the wrong
> > list.)
> >
> > A patch was applied to the kernel in November 2008 that deletes virtual
> > network interfaces when network namespaces are cleaned up
> > (d0c082cea6dfb9b674b4f6e1e84025662dbd24e8). A discussion about this
> > patch took place on this list
> > (https://lists.linux-foundation.org/pipermail/containers/2008-October/013460.html),
> > where Daniel Lezcano wrote:
> >
> > > After discussing with Benjamin, this patch means an user can no longer
> > > manage a pool of virtual devices because they will be automatically
> > > destroyed when the namespace exits. I don't think it is a big concern,
> > > but just in case I am asking :)
> >
> > I currently have two use cases where this behavior is not desirable:
> >
> > 1. I use a veth pair device to connect two containers together (as
> >    opposed to connecting a container to the host). To do this, I
> >    create the veth pair device manually in the host with iproute2
> >    ("ip link add type veth"). Then when I start each container, it
> >    pulls in one of the interfaces of the veth pair device with
> >    "lxc.network.type = phys".
> >    When I stop one of the containers, its
> >    interface to the veth pair device is deleted instead of moved back
> >    to the host, so I can not just start the stopped container again
> >    and re-establish the same link.
>
> Maybe you can rely on the lxc configuration to do that.
>
> Assuming you create the two container always in the same order.
>
> The first one:
>
> lxc.network.type=veth
> lxc.network.veth.pair=vethX
>
> The second one
>
> lxc.network.type=phys
> lxc.network.link=vethX
>
> The drawback is you have to stop / start both of them.
>
>
> Otherwise, why don't you use the macvlan configuration ?
>
> For both containers:
>
> lxc.network.type=macvlan
> lxc.network.macvlan.mode=bridge
> lxc.network.link=dummy0
>
> > 2. I start a process in the host that creates a TUN/TAP interface,
> >    such as a VPN client. I pull the TUN/TAP interface into the
> >    container with "lxc.network.type = phys". When the container
> >    exits, the TUN/TAP interface is deleted because it is a virtual
> >    interface, while the VPN client process continues to run in the
> >    host. Again I can not just start the container again with the
> >    same connection; I have to restart the VPN client.
> >
> > It makes sense that virtual network interfaces that get created inside a
> > container should be deleted when the container exits. However, I feel
> > that network interfaces from the host that get assigned to the container
> > should be returned to the host when the container exits, whether they
> > are physical or virtual.
>
> Wouldn't make sense to add a configuration option for lxc to create such
> device and handle the vpn client ?
>
> There is the lxc.network.script.up option where you can launch your vpn
> client. So adding the tun/tap interface as a network option, lxc will
> create it for you and when it is up, the up script is invoked where the
> vpn client is launched.
>
> The lxc.network.script.down does not exist yet, but it is quite easy to
> add the option.
>
> What do you think ?
>
> > Can the kernel distinguish between network interfaces that were created
> > inside the namespace, and network interfaces that were moved there?
>
> IMHO that will add more complexity to the network namespace, especially
> to handle the nested namespaces. Furthermore that will impact the
> current design. I am not really in favor of that as that was initial
> behavior and there were limitations.
>
> _______________________________________________
> Containers mailing list
> Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
> https://lists.linux-foundation.org/mailman/listinfo/containers
>

-- 
Renato Westphal

commit 4b938c007d9a20d7ee6753083d7a9c6b1f098671
Author: Renato Westphal <rwestphal@xxxxxxxxxxxx>
Date:   Sun Feb 27 02:07:56 2011 -0300

    netns: Preserve imported virtual interfaces during namespace cleanup

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index b21e405..7cce799 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1019,6 +1019,8 @@ struct net_device {
 #ifdef CONFIG_NET_NS
 	/* Network namespace this network device is inside */
 	struct net		*nd_net;
+	/* Initial network namespace of this network device */
+	struct net		*nd_init_net;
 #endif
 
 	/* mid-layer private */
diff --git a/net/core/dev.c b/net/core/dev.c
index f3a24c4..16d9bc4 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5830,6 +5830,7 @@ static struct pernet_operations __net_initdata netdev_net_ops = {
 static void __net_exit default_device_exit(struct net *net)
 {
 	struct net_device *dev, *aux;
+	struct net *dest_net;
 	/*
 	 * Push all migratable network devices back to the
 	 * initial network namespace
@@ -5844,12 +5845,13 @@ static void __net_exit default_device_exit(struct net *net)
 			continue;
 
 		/* Leave virtual devices for the generic cleanup */
-		if (dev->rtnl_link_ops)
+		if (dev->rtnl_link_ops && dev->nd_net == dev->nd_init_net)
 			continue;
 
 		/* Push remaing network devices to init_net */
+		dest_net = dev->rtnl_link_ops ? dev->nd_init_net : &init_net;
 		snprintf(fb_name, IFNAMSIZ, "dev%d", dev->ifindex);
-		err = dev_change_net_namespace(dev, &init_net, fb_name);
+		err = dev_change_net_namespace(dev, dest_net, fb_name);
 		if (err) {
 			printk(KERN_EMERG "%s: failed to move %s to init_net: %d\n",
 				__func__, dev->name, err);
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 19bedd5..b2e3155 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1394,6 +1394,7 @@ struct net_device *rtnl_create_link(struct net *src_net, struct net *net,
 		goto err;
 
 	dev_net_set(dev, net);
+	dev->nd_init_net = dev_net(dev);
 	dev->rtnl_link_ops = ops;
 	dev->rtnl_link_state = RTNL_LINK_INITIALIZING;
 	dev->real_num_tx_queues = real_num_queues;
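
For anyone reviewing, the decision the patch makes in default_device_exit() can be illustrated with a rough user-space sketch. This is not kernel code: the two structs are simplified stand-ins, has_rtnl_link_ops replaces the real dev->rtnl_link_ops pointer, and cleanup_destination() and self_check() are hypothetical names that exist only in this example. The other checks the real function performs before reaching this point are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption, not kernel code). */
struct net {
	const char *name;
};

struct net_device {
	bool has_rtnl_link_ops;   /* true for virtual devices (veth, tun, ...) */
	struct net *nd_net;       /* namespace the device currently lives in */
	struct net *nd_init_net;  /* namespace recorded at rtnl_create_link() time */
};

static struct net init_net = { "init_net" };

/*
 * Hypothetical helper: returns the namespace a device is moved to when
 * its current namespace exits, or NULL when the device is left for the
 * generic cleanup (that is, destroyed together with the namespace).
 */
static struct net *cleanup_destination(const struct net_device *dev)
{
	/* Virtual device created inside the dying namespace: leave it. */
	if (dev->has_rtnl_link_ops && dev->nd_net == dev->nd_init_net)
		return NULL;

	/* Imported virtual device goes home; anything else goes to init_net. */
	return dev->has_rtnl_link_ops ? dev->nd_init_net : &init_net;
}

/* Walks through the three cases discussed in the thread. */
static void self_check(void)
{
	struct net host = { "host" };
	struct net container = { "container" };

	struct net_device created_inside = { true, &container, &container };
	struct net_device imported_veth = { true, &container, &host };
	/* nd_init_net is only set in rtnl_create_link(), so it stays NULL here. */
	struct net_device physical = { false, &container, NULL };

	assert(cleanup_destination(&created_inside) == NULL);
	assert(cleanup_destination(&imported_veth) == &host);
	assert(cleanup_destination(&physical) == &init_net);
}
```

The third case shows why the ternary falls back to &init_net: in this patch nd_init_net is only assigned for devices created through rtnl_create_link(), so other devices keep the previous behaviour.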