Serge E. Hallyn wrote:
> Quoting Eric W. Biederman (ebiederm@xxxxxxxxxxxx):
>> Ok. I see what you are trying to accomplish with this and honestly I
>> think it is silly.
>>
>> We should start the threads we need in the kernel, and if we need to
>> run clone_pid fine. I am not comfortable exporting clone_with_pid to
>> user space.
>
> Even if we create the task tree in userspace, I don't see why we
> can't have the parent of each nested pid_ns pass CLONE_NEWPID to
> clone_with_pid() instead of doing clone first and then unsharing
> the pidns?
>
> As for clone_with_pid(), I don't particularly like the semantics,
> but as was discussed over IRC, we could have clone_with_pid()
> return -EINVAL unless it is called from a task inside a
> restarting container (and -EPERM if setting a pid in
> a pid_ns which was not created as part of the container). Eric,
> do you dislike that any less?

Wouldn't this mean the kernel would have to track which namespaces are
part of a restart and which aren't? Seems a little kludgy to me.

>
>> As for the implementation of allocating a struct pid with a certain
>> set of pid values, I expect we can do that easily enough by
>> refactoring the pid allocator to be passed in the min/max pid to
>> allocate from, and have a special case that passes in a different set
>> of min/max values so we can allocate just the pid we need.
>
> What is wrong with Alexey's patch, which simply passes in the values
> themselves? Do you have another use in mind for the min/max pid
> values?
>
>> If the primary use for a userspace interface is restart I feel we are
>> doing it wrong.
>
> I think that's a good guideline, bad rule. Certainly possible
> that you're right that this is just pointing to in-kernel
> recreation of process tree as the way to go. I was getting
> that feeling myself, but then there are still very good reasons
> not to do that, as there are things which each task should do
> before completing sys_restart() which are best done in userspace.
> These include for instance creating virtual nics, and calling
> Oren's suggested 'cr_advise()' system calls.
>
> -serge
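
To make the min/max idea quoted above a bit more concrete, here is a rough
userspace sketch of an allocator that takes a range, with the "give me exactly
this pid" case expressed as min == max. This is only an illustration under my
own assumptions: struct id_map, alloc_id_range(), alloc_id_exact() and ID_MAX
are hypothetical names, not anything from the kernel or from Alexey's patch,
and the choice of error codes is a guess.

```c
#include <errno.h>
#include <limits.h>

/* Hypothetical upper bound for this sketch; not a kernel constant. */
#define ID_MAX 1024
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* One bit per id; a set bit means the id is in use. */
struct id_map {
    unsigned long bits[ID_MAX / (sizeof(unsigned long) * CHAR_BIT) + 1];
};

static int id_test(const struct id_map *m, int id)
{
    return (m->bits[id / BITS_PER_WORD] >> (id % BITS_PER_WORD)) & 1UL;
}

static void id_set(struct id_map *m, int id)
{
    m->bits[id / BITS_PER_WORD] |= 1UL << (id % BITS_PER_WORD);
}

/*
 * Allocate the lowest free id in [min, max].
 * Returns the id on success, -EINVAL for a bad range,
 * -EAGAIN if every id in the range is already taken.
 */
static int alloc_id_range(struct id_map *m, int min, int max)
{
    if (min < 1 || max >= ID_MAX || min > max)
        return -EINVAL;

    for (int id = min; id <= max; id++) {
        if (!id_test(m, id)) {
            id_set(m, id);
            return id;
        }
    }
    return -EAGAIN;
}

/* The restart special case: a one-element range requests one exact id. */
static int alloc_id_exact(struct id_map *m, int id)
{
    return alloc_id_range(m, id, id);
}
```

In this form the normal path and the restart path share one function, and the
exact-pid request simply fails if that pid is already in use in the namespace.
Whether that is cleaner than passing the values in directly is the question
being asked above.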