On Fri, Feb 6, 2015 at 12:12 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote: > On Fri, Feb 6, 2015 at 12:07 PM, Kees Cook <keescook@xxxxxxxxxxxx> wrote: >> On Fri, Feb 6, 2015 at 11:32 AM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote: >>> On Fri, Feb 6, 2015 at 11:23 AM, Kees Cook <keescook@xxxxxxxxxxxx> wrote: >>>> And especially since a ptracer >>>> can change syscalls during syscall-enter-stop to any syscall it wants, >>>> bypassing seccomp. This condition is already documented. >>> >>> If a ptracer (using PTRACE_SYSCALL) were to get the entry callback >>> before seccomp, then this oddity would go away, which might be a good >>> thing. A ptracer could change the syscall, but seccomp would based on >>> what the ptracer changed the syscall to. >> >> I want kill events to trigger immediately. I don't want to leave the >> ptrace surface available on a SECCOMP_RET_KILL. So maybe it can be >> seccomp phase 1, then ptrace, then seccomp phase 2? And pass more >> information between phases to determine how things should behave >> beyond just "skip"? > > I thought so too, originally, but I'm far less convinced now, for two reasons: > > 1. I think that a lot of filters these days use RET_ERRNO heavily, so > this won't benefit them. > > 2. I'm not convinced it really reduces the attack surface for anyone. > Unless your filter is literally "return SECCOMP_RET_KILL", then the > seccomp-filtered task can always cause the ptracer to get a pair of > syscall notifications. Also, the task can send itself signals (using > page faults, breakpoints, etc) and cause ptrace events via other > paths. What are you thinking for a solution? As for capping SECCOMP_RET_ERRNO to MAX_ERRNO, how about this (sorry if gmail butchers the paste): diff --git a/kernel/seccomp.c b/kernel/seccomp.c index 4ef9687ac115..c88148d20bd5 100644 --- a/kernel/seccomp.c +++ b/kernel/seccomp.c @@ -629,7 +629,9 @@ static u32 __seccomp_phase1_filter(int this_syscall, struct switch (action) { case SECCOMP_RET_ERRNO: - /* Set the low-order 16-bits as a errno. */ + /* Set the low-order bits as a errno. */ + if (data > MAX_ERRNO) + data = MAX_ERRNO; syscall_set_return_value(current, task_pt_regs(current), -data, 0); goto skip; -- Kees Cook Chrome OS Security -- To unsubscribe from this list: send the line "unsubscribe linux-arch" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html