On Wed, Jun 25, 2014 at 11:09 AM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote: > On Wed, Jun 25, 2014 at 10:57 AM, Kees Cook <keescook@xxxxxxxxxxxx> wrote: >> On Wed, Jun 25, 2014 at 10:24 AM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote: >>> On 06/25, Kees Cook wrote: >>>> >>>> On Wed, Jun 25, 2014 at 9:52 AM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote: >>>> > >>>> > Yes, at least this should close the race with suid-exec. And there are no >>>> > other users. Except apparmor, and I hope you will check it because I simply >>>> > do not know what it does ;) >>>> > >>>> >> I wonder if changes to nnp need to "flushed" during syscall entry >>>> >> instead of getting updated externally/asynchronously? That way it >>>> >> won't be out of sync with the seccomp mode/filters. >>>> >> >>>> >> Perhaps secure computing needs to check some (maybe seccomp-only) >>>> >> atomic flags and flip on the "real" nnp if found? >>>> > >>>> > Not sure I understand you, could you clarify? >>>> >>>> Instead of having TSYNC change the nnp bit, it can set a new flag, say: >>>> >>>> task->seccomp.flags |= SECCOMP_NEEDS_NNP; >>>> >>>> This would be set along with seccomp.mode, seccomp.filter, and >>>> TIF_SECCOMP. Then, during the next secure_computing() call that thread >>>> makes, it would check the flag: >>>> >>>> if (task->seccomp.flags & SECCOMP_NEEDS_NNP) >>>> task->nnp = 1; >>>> >>>> This means that nnp couldn't change in the middle of a running syscall. >>> >>> Aha, so you were worried about the same thing. Not sure we need this, >>> but at least I understand you and... >>> >>>> Hmmm. Perhaps this doesn't solve anything, though? Perhaps my proposal >>>> above would actually make things worse, since now we'd have a thread >>>> with seccomp set up, and no nnp. If it was in the middle of exec, >>>> we're still causing a problem. >>> >>> Yes ;) >>> >>>> I think we'd also need a way to either delay the seccomp changes, or >>>> to notice this condition during exec. Bleh. >>> >>> Hmm. confused again, >> >> I mean to suggest that the tsync changes would be stored in each >> thread, but somewhere other than the true seccomp struct, but with >> TIF_SECCOMP set. When entering secure_computing(), current would check >> for the "changes to sync", and apply them, then start the syscall. In >> this way, we can never race a syscall (like exec). > > I'm not sure that helps. If you set a pending filter part-way through > exec, and exec copies that pending filter but doesn't notice NNP, then > there's an exploitable race. > >> >>>> What actually happens with a multi-threaded process calls exec? I >>>> assume all the other threads are destroyed? >>> >>> Yes. But this is the point-of-no-return, de_thread() is called after the execing >>> thared has already passed (say) check_unsafe_exec(). >>> >>> However, do_execve() takes cred_guard_mutex at the start in prepare_bprm_creds() >>> and drops it in install_exec_creds(), so it should solve the problem? >> >> I can't tell yet. I'm still trying to understand the order of >> operations here. It looks like de_thread() takes the sighand lock. >> do_execve_common does: >> >> prepare_bprm_creds (takes cred_guard_mutex) >> check_unsafe_exec (checks nnp to set LSM_UNSAFE_NO_NEW_PRIVS) >> prepare_binprm (handles suid escalation, checks nnp separately) >> security_bprm_set_creds (checks LSM_UNSAFE_NO_NEW_PRIVS) >> exec_binprm >> load_elf_binary >> flush_old_exec >> de_thread (takes and releases sighand->lock) >> install_exec_creds (releases cred_guard_mutex) >> >> I don't see a way to use cred_guard_mutex during tsync (which holds >> sighand->lock) without dead-locking. What were you considering here? > > Grab cred_guard_mutex and then sighand->lock, perhaps? Ah, yes, task->signal is like sighand: shared across all threads. -- Kees Cook Chrome OS Security