Quoting Sukadev Bhattiprolu (sukadev@xxxxxxxxxxxxxxxxxx):
>
> Subject: [RFC][v4][PATCH 7/7]: Define clone_extended() syscall
>
> Container restart requires that a task have the same pid it had when it was
> checkpointed. When containers are nested the tasks within the containers
> exist in multiple pid namespaces and hence have multiple pids to specify
> during restart.
>
> This patch defines a new system call, clone_extended(), which is like clone()
> but takes a new 'pid_set' parameter. This parameter lets the caller choose
> specific pid numbers for the child process, in the process's active and
> ancestor pid namespaces. (Descendant pid namespaces in general don't matter
> since processes don't have pids in them anyway, but see comments in
> copy_target_pids() regarding CLONE_NEWPID).
>
> Unlike clone(), however, clone_extended() needs CAP_SYS_ADMIN, at least for
> now, to prevent unprivileged processes from misusing this interface.

It only needs that when specifying pids.

> While the main motivation for this interface is the need to let a process
> choose its 'pid numbers', the clone_extended() interface uses 64-bit clone
> flags. The 'higher' portion of the clone flags is unused and is only
> included to preclude yet another version of clone when a new clone flag is
> needed.
>
> ===== Interface:
>
> Compared to clone(), clone_extended() needs to pass in three more pieces
> of information:
>
> 	- an additional 32 bits of clone_flags
> 	- number of pids in the set
> 	- user buffer containing the list of pids.
>
> But since clone() already takes 5 parameters and some (all?) architectures
> are restricted to 6 parameters to a system call, additional data structures
> (and copy_from_user()) are needed.
>
> The proposed interface for clone_extended() is:
>
> 	struct clone_tid_info {
> 		void *parent_tid;	/* parent_tid_ptr parameter */
> 		void *child_tid;	/* child_tid_ptr parameter */
> 	};
>
> 	struct pid_set {
> 		int num_pids;
> 		pid_t *pids;
> 	};
>
> 	int clone_extended(int flags_low, int flags_high, void *child_stack,
> 			void *unused, struct clone_tid_info *tid_ptrs,
> 			struct pid_set *pid_setp);

I was thinking additional flags would be passed in the (renamed) struct
pid_set.

-serge
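
To make the two comments above concrete, here is a rough userspace sketch, not
the patch's actual interface. It assumes the renamed struct also carries the
upper 32 flag bits, and that the privilege check is only relevant when pids are
actually requested. The names clone_ext_args, clone_ext_flags() and
clone_ext_needs_admin() are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical replacement for struct pid_set: it also carries the
 * upper 32 bits of the clone flags, so no separate flags_high
 * syscall argument would be needed. */
struct clone_ext_args {
	uint32_t flags_high;	/* upper 32 bits of the 64-bit clone flags */
	int num_pids;		/* number of entries in pids[] */
	pid_t *pids;		/* requested pids, one per pid namespace level */
};

/* Build the 64-bit flag word from the existing 32-bit clone flags and
 * the optional extended arguments. A NULL args pointer means "no
 * extended flags", so the call degenerates to plain clone() flags. */
static uint64_t clone_ext_flags(uint32_t flags_low,
				const struct clone_ext_args *args)
{
	uint32_t high = args ? args->flags_high : 0;

	return ((uint64_t)high << 32) | flags_low;
}

/* Assumed policy from the reply above: a privilege check such as
 * CAP_SYS_ADMIN would only be required when the caller actually asks
 * for specific pids. Returns 1 if the check is needed, 0 otherwise. */
static int clone_ext_needs_admin(const struct clone_ext_args *args)
{
	return args != NULL && args->num_pids > 0;
}
```

With this layout a caller that passes no pids (num_pids == 0, or a NULL
pointer) would behave like an ordinary clone() and would not need the
capability, while the high flag bits stay available for future clone flags.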