Sukadev Bhattiprolu wrote: > Oren Laadan [orenl@xxxxxxxxxxx] wrote: > | > | > | Sukadev Bhattiprolu wrote: > | > === NEW CLONE() SYSTEM CALL: > | > > | > To support application checkpoint/restart, a task must have the same pid it > | > had when it was checkpointed. When containers are nested, the tasks within > | > the containers exist in multiple pid namespaces and hence have multiple pids > | > to specify during restart. > | > > | > This patchset implements a new system call, clone2() that lets a process > | > specify the pids of the child process. > | > > | > Patches 1 through 6 are helper patches, needed for choosing a pid for the > | > child process. > | > > | > Patch 8 defines a prototype of the new system call. Patch 9 adds some > | > documentation on the new system call, some/all of which will eventually > | > go into a man page. > | > > | > | [...] > | > | > > | > Based on these requirements and constraints, we explored a couple of system > | > call interfaces (in earlier versions of this patchset) and currently define > | > the system call as: > | > > | > struct clone_struct { > | > u64 flags; > | > u64 child_stack; > | > u32 nr_pids; > | > u32 parent_tid; > | > u32 child_tid; > | > | So @parent_tid and @child_tid are pointers to userspace memory and > | require 'u64' (and it won't hurt to make @reserved1 a 'u64' as well). > > Well, if we make parent_tid and child_tid u64, we could move reserved1 > after ->nr_pids and leave it as a 32-bit value. Sure. In any case, won't hurt to leave large reserved space - someone may be thankful for it in the future ;) > > | > | > u32 reserved1; > | > u64 reserved2; > | > }; > | > > | > | Also, for forward/backward compatibility, explicitly state in the > | documentation, and enforce in the kernel, that flags which are not > | defined must not be set, and that reserved{1,2} must remain 0. > > Agree with checking for reserved1 and reserved2. > > We currently don't check for invalid clone_flags - we just ignore them. > Adding checks like > > if (fls(kcs.flags) > fls(CLONE_LAST_FLAG)) > > would assume we always use bits in order (while it seems to make sense, to > use them in order, we don't seem to have done so in the past). > > Alternatively we could define a CLONE_FLAG_MASK of valid flags and update > the mask when each new clone flag is added. > > But do we really need to check for invalid flags ? I'd go for a a mask. The idea is that we want to educate userspace to _not_ use unused flags now. For if userspace sets an unused flag now and we let it be, the application will break when we give meaning to that flag. Oren. _______________________________________________ Containers mailing list Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linux-foundation.org/mailman/listinfo/containers