On Wed, Jun 25, 2014 at 10:57 AM, Kees Cook <keescook@xxxxxxxxxxxx> wrote: > On Wed, Jun 25, 2014 at 10:24 AM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote: >> On 06/25, Kees Cook wrote: >>> >>> On Wed, Jun 25, 2014 at 9:52 AM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote: >>> > >>> > Yes, at least this should close the race with suid-exec. And there are no >>> > other users. Except apparmor, and I hope you will check it because I simply >>> > do not know what it does ;) >>> > >>> >> I wonder if changes to nnp need to "flushed" during syscall entry >>> >> instead of getting updated externally/asynchronously? That way it >>> >> won't be out of sync with the seccomp mode/filters. >>> >> >>> >> Perhaps secure computing needs to check some (maybe seccomp-only) >>> >> atomic flags and flip on the "real" nnp if found? >>> > >>> > Not sure I understand you, could you clarify? >>> >>> Instead of having TSYNC change the nnp bit, it can set a new flag, say: >>> >>> task->seccomp.flags |= SECCOMP_NEEDS_NNP; >>> >>> This would be set along with seccomp.mode, seccomp.filter, and >>> TIF_SECCOMP. Then, during the next secure_computing() call that thread >>> makes, it would check the flag: >>> >>> if (task->seccomp.flags & SECCOMP_NEEDS_NNP) >>> task->nnp = 1; >>> >>> This means that nnp couldn't change in the middle of a running syscall. >> >> Aha, so you were worried about the same thing. Not sure we need this, >> but at least I understand you and... >> >>> Hmmm. Perhaps this doesn't solve anything, though? Perhaps my proposal >>> above would actually make things worse, since now we'd have a thread >>> with seccomp set up, and no nnp. If it was in the middle of exec, >>> we're still causing a problem. >> >> Yes ;) >> >>> I think we'd also need a way to either delay the seccomp changes, or >>> to notice this condition during exec. Bleh. >> >> Hmm. confused again, > > I mean to suggest that the tsync changes would be stored in each > thread, but somewhere other than the true seccomp struct, but with > TIF_SECCOMP set. When entering secure_computing(), current would check > for the "changes to sync", and apply them, then start the syscall. In > this way, we can never race a syscall (like exec). I'm not sure that helps. If you set a pending filter part-way through exec, and exec copies that pending filter but doesn't notice NNP, then there's an exploitable race. > >>> What actually happens with a multi-threaded process calls exec? I >>> assume all the other threads are destroyed? >> >> Yes. But this is the point-of-no-return, de_thread() is called after the execing >> thared has already passed (say) check_unsafe_exec(). >> >> However, do_execve() takes cred_guard_mutex at the start in prepare_bprm_creds() >> and drops it in install_exec_creds(), so it should solve the problem? > > I can't tell yet. I'm still trying to understand the order of > operations here. It looks like de_thread() takes the sighand lock. > do_execve_common does: > > prepare_bprm_creds (takes cred_guard_mutex) > check_unsafe_exec (checks nnp to set LSM_UNSAFE_NO_NEW_PRIVS) > prepare_binprm (handles suid escalation, checks nnp separately) > security_bprm_set_creds (checks LSM_UNSAFE_NO_NEW_PRIVS) > exec_binprm > load_elf_binary > flush_old_exec > de_thread (takes and releases sighand->lock) > install_exec_creds (releases cred_guard_mutex) > > I don't see a way to use cred_guard_mutex during tsync (which holds > sighand->lock) without dead-locking. What were you considering here? > Grab cred_guard_mutex and then sighand->lock, perhaps? > -Kees > > -- > Kees Cook > Chrome OS Security -- Andy Lutomirski AMA Capital Management, LLC