(Oops, forgot to send this series through the lsm list...)

On Thu, Jun 9, 2016 at 2:01 PM, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
> There has been a long-standing (and documented) issue with seccomp
> where ptrace can be used to change a syscall out from under seccomp.
> This is a problem for containers and other wider seccomp filtered
> environments where ptrace needs to remain available, as it allows
> for an escape of the seccomp filter.
>
> Since the ptrace attack surface is available for any allowed syscall,
> moving seccomp after ptrace doesn't increase the actually available
> attack surface. And this actually improves tracing since, for
> example, tracers will be notified of syscall entry before seccomp
> sends a SIGSYS, which makes debugging filters much easier.
>
> The per-architecture changes do make one (hopefully small)
> semantic change, which is that since ptrace comes first, it may
> request a syscall be skipped. Running seccomp after this doesn't
> make sense, so if ptrace wants to skip a syscall, it will bail
> out early similarly to how seccomp was. This means that skipped
> syscalls will not be fed through audit, though that likely means
> we're actually avoiding noise this way.
>
> This series first cleans up seccomp to remove the now unneeded
> two-phase entry, fixes the SECCOMP_RET_TRACE hole (same as the
> ptrace hole above), and then reorders seccomp after ptrace on
> each architecture.

Has anyone else had a chance to review this series? I'd like to get it
landed in -next as early as possible in case there are unexpected
problems...

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security