On Wed, Jun 25, 2014 at 7:21 AM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> On 06/24, Kees Cook wrote:
>>
>> +static void seccomp_sync_threads(void)
>> +{
>> +	struct task_struct *thread, *caller;
>> +
>> +	BUG_ON(!spin_is_locked(&current->sighand->siglock));
>> +
>> +	/* Synchronize all threads. */
>> +	caller = current;
>> +	for_each_thread(caller, thread) {
>> +		/* Get a task reference for the new leaf node. */
>> +		get_seccomp_filter(caller);
>> +		/*
>> +		 * Drop the task reference to the shared ancestor since
>> +		 * current's path will hold a reference. (This also
>> +		 * allows a put before the assignment.)
>> +		 */
>> +		put_seccomp_filter(thread);
>> +		thread->seccomp.filter = caller->seccomp.filter;
>> +		/* Opt the other thread into seccomp if needed.
>> +		 * As threads are considered to be trust-realm
>> +		 * equivalent (see ptrace_may_access), it is safe to
>> +		 * allow one thread to transition the other.
>> +		 */
>> +		if (thread->seccomp.mode == SECCOMP_MODE_DISABLED) {
>> +			/*
>> +			 * Don't let an unprivileged task work around
>> +			 * the no_new_privs restriction by creating
>> +			 * a thread that sets it up, enters seccomp,
>> +			 * then dies.
>> +			 */
>> +			if (task_no_new_privs(caller))
>> +				task_set_no_new_privs(thread);
>> +
>> +			seccomp_assign_mode(thread, SECCOMP_MODE_FILTER);
>> +		}
>> +	}
>> +}
>
> OK, personally I think this all make sense. I even think that perhaps
> SECCOMP_FILTER_FLAG_TSYNC should allow filter == NULL, a thread might
> want to "sync" without adding the new filter, but this is minor/offtopic.
>
> But. Doesn't this change add a new security hole?
>
> Obviously, we should not allow to install a filter and then (say) exec
> a suid binary, that is why we have no_new_privs/LSM_UNSAFE_NO_NEW_PRIVS.
>
> But what if "thread->seccomp.filter = caller->seccomp.filter" races with
> any user of task_no_new_privs() ? Say, suppose this thread has already
> passed check_unsafe_exec/etc and it is going to exec the suid binary?

Oh, ew. Yeah.
It looks like there's a cred lock to be held to combat this? I wonder
if changes to nnp need to be "flushed" during syscall entry instead of
getting updated externally/asynchronously? That way nnp won't be out of
sync with the seccomp mode/filters. Perhaps secure computing needs to
check some (maybe seccomp-only) atomic flags and flip on the "real" nnp
if found?

-Kees

-- 
Kees Cook
Chrome OS Security
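
To make the "flush at syscall entry" idea above concrete, here is a rough
userspace sketch in C11. It is not kernel code and not a proposed patch:
struct sketch_task, SKETCH_PENDING_NNP, sketch_request_nnp() and
sketch_syscall_entry() are all hypothetical names invented for illustration.
The only point is the shape of the mechanism: the syncing thread sets an
atomic "pending" bit rather than writing the target's nnp directly, and the
target applies it itself the next time it enters a syscall.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit: another thread asked this task to enable nnp. */
#define SKETCH_PENDING_NNP 0x1u

/* Toy stand-in for a task; this is NOT the kernel's task_struct layout. */
struct sketch_task {
	atomic_uint pending;	/* may be written by any thread */
	bool no_new_privs;	/* only ever written by the owning task */
};

/*
 * Called by the syncing thread. Instead of setting the target's nnp
 * directly (asynchronously), it only records a request.
 */
static void sketch_request_nnp(struct sketch_task *t)
{
	assert(t != NULL);
	atomic_fetch_or(&t->pending, SKETCH_PENDING_NNP);
}

/*
 * Called by the owning task at (simulated) syscall entry. Any pending
 * request is consumed atomically and the "real" nnp bit is flipped on.
 */
static void sketch_syscall_entry(struct sketch_task *t)
{
	unsigned int flags;

	assert(t != NULL);
	flags = atomic_exchange(&t->pending, 0);
	if (flags & SKETCH_PENDING_NNP)
		t->no_new_privs = true;
}
```

In this toy model the target's no_new_privs only ever changes in the target's
own context, at a well-defined point. Whether that is actually enough to close
the exec race described above is a separate question that this sketch does not
answer.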