There has been a long-standing (and documented) issue with seccomp where ptrace can be used to change a syscall out from under seccomp. This is a problem for containers and other seccomp-filtered environments where ptrace needs to remain available, since it allows an escape from the seccomp filter.

Since the ptrace attack surface is already reachable from any allowed syscall, moving seccomp after ptrace doesn't increase the available attack surface. It also improves tracing: for example, tracers will now be notified of syscall entry before seccomp sends a SIGSYS, which makes debugging filters much easier.

The per-architecture changes do make one (hopefully small) semantic change: since ptrace comes first, it may request that a syscall be skipped. Running seccomp after this doesn't make sense, so if ptrace wants to skip a syscall, the entry path bails out early, similar to how seccomp did. This means that skipped syscalls will not be fed through audit, though that likely just avoids noise.

This series first cleans up seccomp to remove the now-unneeded two-phase entry, then fixes the SECCOMP_RET_TRACE hole (the same as the ptrace hole above), and finally reorders seccomp after ptrace on each architecture.

Thanks,

-Kees
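
For illustration only, the ordering described above can be sketched as a small userspace model. This is not kernel code: every name here (syscall_entry, tracer_fn, filter_fn, the MODEL_NR_* constants, the result enum) is hypothetical, and the use of -1 as the "skip" request is an assumption of the model. It only shows why running the filter after the tracer closes the hole, and where the early bail-out happens.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical syscall numbers for the model; not real syscall numbers. */
#define MODEL_NR_ALLOWED 1L
#define MODEL_NR_DENIED  2L
#define MODEL_NR_SKIP   (-1L)   /* assumption: tracer returns -1 to request a skip */

enum entry_result {
    ENTRY_RUN,                  /* syscall proceeds */
    ENTRY_SKIPPED_BY_TRACER,    /* tracer asked to skip; filter never runs */
    ENTRY_REJECTED_BY_FILTER    /* filter refused the (possibly rewritten) syscall */
};

struct entry_outcome {
    enum entry_result result;
    long nr;                    /* syscall number after the tracer had its say */
};

/* A tracer may rewrite the syscall number or request a skip. */
typedef long (*tracer_fn)(long nr);
/* A filter returns true if the syscall number is allowed. */
typedef bool (*filter_fn)(long nr);

/* Model of the reordered entry path: tracer first, then filter. */
struct entry_outcome syscall_entry(long nr, tracer_fn tracer, filter_fn filter)
{
    struct entry_outcome out = { ENTRY_RUN, nr };

    if (tracer != NULL) {
        out.nr = tracer(out.nr);
        if (out.nr == MODEL_NR_SKIP) {
            /* Bail out early: filtering a skipped syscall makes no sense. */
            out.result = ENTRY_SKIPPED_BY_TRACER;
            return out;
        }
    }

    /* The filter sees what the tracer left behind, not the original number. */
    if (filter != NULL && !filter(out.nr))
        out.result = ENTRY_REJECTED_BY_FILTER;

    return out;
}

/* Example filter: only MODEL_NR_ALLOWED passes. */
bool model_filter(long nr)
{
    return nr == MODEL_NR_ALLOWED;
}

/* Example tracer: tries to turn the allowed syscall into the denied one. */
long rewriting_tracer(long nr)
{
    return nr == MODEL_NR_ALLOWED ? MODEL_NR_DENIED : nr;
}

/* Example tracer: always requests a skip. */
long skipping_tracer(long nr)
{
    (void)nr;
    return MODEL_NR_SKIP;
}
```

In this model, calling syscall_entry(MODEL_NR_ALLOWED, rewriting_tracer, model_filter) is rejected by the filter, because the filter evaluates the rewritten number. With the old ordering (filter first, tracer second) the same rewrite would have gone through unchecked.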