While implementing support for nested pid namespaces in CRIU (the Checkpoint/Restore In Userspace tool), we ran into a situation where it is impossible to efficiently create a task with a specific NSpid. Since commit 49f4d8b93ccf "pidns: Capture the user namespace and filter ns_last_pid", ns_last_pid can be set only for the task's active pid_ns (before that commit it could also be set for pid_ns_for_children). Thus, if a restored task in a container has more than one pid_ns level, the restorer code must have a helper task for every pid namespace in the task's pid_ns hierarchy.

This is a big problem, because communicating with a helper for every pid_ns in the hierarchy is expensive. It implies many helper wakeups to create a single task (regardless of how you communicate with the helpers).

This patchset tries to solve the problem. It introduces namespace-specific ioctls and implements one for pid_ns that allows writing a vector of last pids for the pid_ns hierarchy.

The vector is passed as a ":"-delimited string of pids, written in reverse order. The first number corresponds to ns_last_pid of the opened namespace, the second to that of its parent, and so on. If you have a pid namespace hierarchy like:

	pid_ns1 (grandfather)
	  |
	  v
	pid_ns2 (father)
	  |
	  v
	pid_ns3 (child)

and the ns of a task in pid_ns3 is open, then the corresponding vector will be "last_ns_pid3:last_ns_pid2:last_ns_pid1". The vector may be shorter and cover fewer levels, for example "last_ns_pid3:last_ns_pid2" or even "last_ns_pid3", depending on which levels you want to populate. The last_ns_pidX values are plain numbers written in decimal form.
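The vector format described above can be sketched from the userspace side. The helper below is hypothetical and not part of the patchset: the name format_last_pid_vector() and its signature are assumptions made for illustration, and it only builds the string. It does not issue the ioctl, whose request name is not given in this letter.

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper (an illustration, not code from the patchset):
 * format ns_last_pid values as the ":"-delimited vector described above.
 * pids[0] belongs to the opened (deepest) pid namespace, pids[1] to its
 * parent, and so on up the hierarchy.
 *
 * Returns the length of the resulting string, or -1 on bad input or if
 * buf is too small to hold the whole vector.
 */
int format_last_pid_vector(char *buf, size_t size,
                           const pid_t *pids, size_t nr)
{
    size_t off = 0;

    if (!buf || !pids || nr == 0 || size == 0)
        return -1;

    for (size_t i = 0; i < nr; i++) {
        int n;

        if (pids[i] < 0)
            return -1;  /* assumption: last pids are non-negative */

        /* Prefix every element except the first with ':' */
        n = snprintf(buf + off, size - off, "%s%d",
                     i ? ":" : "", (int)pids[i]);
        if (n < 0 || (size_t)n >= size - off)
            return -1;  /* output would have been truncated */
        off += (size_t)n;
    }
    return (int)off;
}
```

For the three-level hierarchy above, passing {last_ns_pid3, last_ns_pid2, last_ns_pid1} with nr = 3 yields the full vector, while nr = 1 yields only the entry for the opened namespace.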
---

Kirill Tkhai (2):
      nsfs: Add namespace-specific ioctl (NS_SPECIFIC_IOC)
      pid_ns: Introduce ioctl to set vector of ns_last_pid's on ns hierarhy

 fs/nsfs.c                 |    4 ++
 include/linux/proc_ns.h   |    1 +
 include/uapi/linux/nsfs.h |   11 ++++++
 kernel/pid_namespace.c    |   88 +++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 104 insertions(+)

--
Signed-off-by: Kirill Tkhai <ktkhai@xxxxxxxxxxxxx>