From: "Daniel P. Berrange" <berrange@xxxxxxxxxx>

The following commit

  commit cf3f89214ef6a33fad60856bc5ffd7bb2fc4709b
  Author: Daniel Lezcano <daniel.lezcano@xxxxxxx>
  Date:   Wed Mar 28 14:42:51 2012 -0700

    pidns: add reboot_pid_ns() to handle the reboot syscall

introduced custom handling of the reboot() syscall when invoked
from a non-initial PID namespace. The intent was that a process
in a container can be allowed to keep CAP_SYS_BOOT and execute
reboot() to shut down or reboot just its private container, rather
than the host.

Unfortunately the kexec_load() syscall also relies on the
CAP_SYS_BOOT capability. So by allowing a container to keep
this capability to safely invoke reboot(), the container also
unintentionally gains the ability to use kexec_load().

The solution is to make kexec_load() return -EPERM if invoked
from a PID namespace that is not the initial namespace.

Signed-off-by: Daniel P. Berrange <berrange@xxxxxxxxxx>
Cc: Serge Hallyn <serge.hallyn@xxxxxxxxxxxxx>
Cc: Daniel Lezcano <daniel.lezcano@xxxxxxx>
Cc: Michael Kerrisk <mtk.manpages@xxxxxxxxx>
Cc: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
Cc: Tejun Heo <tj@xxxxxxxxxx>
Cc: Oleg Nesterov <oleg@xxxxxxxxxx>
---
 kernel/kexec.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/kernel/kexec.c b/kernel/kexec.c
index 0668d58..b152bde 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -947,6 +947,11 @@ SYSCALL_DEFINE4(kexec_load, unsigned long, entry, unsigned long, nr_segments,
 	if (!capable(CAP_SYS_BOOT))
 		return -EPERM;
 
+	/* Processes in containers must not be allowed to load a new
+	 * kernel, even if they have CAP_SYS_BOOT */
+	if (task_active_pid_ns(current) != &init_pid_ns)
+		return -EPERM;
+
 	/*
 	 * Verify we have a legal set of flags
 	 * This leaves us room for future extensions.
-- 
1.7.11.2

_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/containers
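
For anyone wanting to observe the behaviour from userspace, the following is a rough probe sketch, not part of the patch. It relies on the ordering visible in the hunk above: the permission checks run before the flags check, so calling kexec_load() with a deliberately invalid flag should fail with EPERM when permission is denied and with EINVAL when the permission checks pass, without loading or unloading anything. The names probe_kexec_load(), describe_probe() and PROBE_INVALID_FLAG are hypothetical, and the choice of 0x8000 as an undefined flag bit is an assumption that may not hold on every kernel version.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical probe value: assumed to lie outside both the defined
 * KEXEC_* flag bits and the architecture mask (0xffff0000), so the
 * kernel rejects it with EINVAL once the permission checks pass. */
#define PROBE_INVALID_FLAG 0x8000UL

/* Returns the errno produced by a harmless kexec_load() call,
 * or 0 in the unexpected case that the call succeeds. */
int probe_kexec_load(void)
{
#ifdef SYS_kexec_load
	long rc;

	errno = 0;
	/* entry = 0, nr_segments = 0, segments = NULL, invalid flags */
	rc = syscall(SYS_kexec_load, 0UL, 0UL, NULL, PROBE_INVALID_FLAG);
	return rc == -1 ? errno : 0;
#else
	/* This architecture's headers do not expose the syscall number. */
	return ENOSYS;
#endif
}

/* Maps the probe result to a human-readable interpretation. */
const char *describe_probe(int err)
{
	switch (err) {
	case EPERM:
		return "denied: no CAP_SYS_BOOT, a non-initial PID namespace, or another policy";
	case EINVAL:
		return "permission checks passed: the call got as far as the flags check";
	case ENOSYS:
		return "kexec_load is not available here";
	default:
		return "unexpected result";
	}
}
```

A caller would typically print describe_probe(probe_kexec_load()) once on the host and once inside a container that keeps CAP_SYS_BOOT. With the patch applied, the container run is expected to report the "denied" case. Note that EPERM alone does not identify which check refused the call, since seccomp filters or other security policies can return the same error.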