OL> What about leak detection ? OL> Aren't we missing {netns,netdev}_users()? This is something I need to give more thought to, but it's not as easy as it sounds. Network devices aren't released at the last put() like a lot of other things, and my initial attempts to reconcile the refcount after a checkpoint operation have not been successful. However, I'm not sure that it's as important here, because AFAIK, a network device can only exist in one network namespace at a time. If we're checkpointing a netdev, it's because we are checkpointing the namespace that it lives in. Making sure the netns isn't leaked out of the process tree would be much easier and just as effective, no? >> +config CHECKPOINT_NETNS >> + bool >> + default y if NET && NET_NS && CHECKPOINT >> + OL> Did you mean this to be visible (settable) by the user ? No, it was specifically supposed to enable itself when those other items are enabled, but not be a user adjustable toggle. I had a discussion with Serge about it and we came to this as a solution, although I don't remember what the problem we started with was. I'll dig through my IRC logs to see if I can figure it out. >> + retry: >> + if (++pages > 4) { >> + addrs = -E2BIG; >> + goto out; >> + } OL> Why 4 ? It's just a sanity limit. OL> Do we really need this special case ? I'd be happy with a ckpt_err() OL> for any error - and the actual error number would be useful to tell OL> which case it was. Unless I'm missing something, you asked for this specifically: https://lists.linux-foundation.org/pipermail/containers/2010-February/022844.html OL> Isn't this check redundant ? I expect it to fail promptly in OL> checkpoint_netdev() above. No, as I've said a couple of times previously, this isn't the only way we can arrive at a netdev for checkpointing. This case is the one where we're marching through the netns and find a netdev that is not supported. The other is where we arrive at a device as a peer of another device, so the other check may come into play at times where this one doesn't and vice versa. OL> This may be a bit simpler if you move the first deferqueue_add() OL> forward to just before the other one. Or better: change dq_netdev OL> to have two pointers, dev and peer (if any is null, the cleanup OL> function will skip). The reason it is this messy is because of the way network devices are deallocated. Since they don't destroy themselves on the final put(), we have to explicitly call unregister_netdev() on them when we know they're no longer used (else we block). Once we've added them to the deferqueue, we can no longer destroy them here because a reference is held and the deferqueue will run afterwards. The ordering of this is a result of me injecting failures at each step and working it out until I got it to not block on unregistering either of the devices in all of the error paths. That's not to say it's the best way, but there is a reason it's ordered the way it is. -- Dan Smith IBM Linux Technology Center email: danms@xxxxxxxxxx _______________________________________________ Containers mailing list Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linux-foundation.org/mailman/listinfo/containers