Re: [PATCH v6 6/9] kernel: entry: Support Syscall User Dispatch for common syscall entry

Christian Brauner <christian.brauner@xxxxxxxxxx> · Mon, 7 Sep 2020 16:25:10 +0200

On Mon, Sep 07, 2020 at 07:15:52AM -0700, Andy Lutomirski wrote:
> 
> 
> > On Sep 7, 2020, at 3:15 AM, Christian Brauner <christian.brauner@xxxxxxxxxx> wrote:
> > 
> > On Fri, Sep 04, 2020 at 04:31:44PM -0400, Gabriel Krisman Bertazi wrote:
> >> Syscall User Dispatch (SUD) must take precedence over seccomp, since the
> >> use case is emulation (it can be invoked with a different ABI) such that
> >> seccomp filtering by syscall number doesn't make sense in the first
> >> place.  In addition, either the syscall is dispatched back to userspace,
> >> in which case there is no resource for seccomp to protect, or the
> > 
> > Tbh, I'm torn here. I'm not a super clever attacker but it feels to me
> > that this is still at least a clever way to circumvent a seccomp
> > sandbox.
> > If I'd be confined by a seccomp profile that would cause me to be
> > SIGKILLed when I try to do open(), I could prctl() myself to do user
> > dispatch to prevent that from happening, no?
> > 
> 
> Not really, I think. The idea is that you didn’t actually do open().
> You did a SYSCALL instruction which meant something else, and the
> syscall dispatch correctly prevented the kernel from misinterpreting
> it as open().

Right, for the case where you're e.g. emulating Windows syscalls that's
true. I was thinking of the case where you're running natively on Linux:
couldn't I first load a seccomp profile "kill me if someone does an
open()", then exec() the target binary, which is set up to do
prctl(USER_DISPATCH) first thing? I guess it's ok because, as far as I
had time to read it, this is an all-or-nothing mechanism, i.e. _all_
system calls are re-routed, in contrast to e.g. seccomp where I could do
this per-syscall. So it wouldn't make sense to use user-dispatch on
Linux per se. Still makes me a little uneasy. :)
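
To make the ordering concrete, here is a toy userspace model of the
scenario above. It is not kernel code and not taken from the patch: the
struct, enum, and function names are made up for illustration, and it
assumes the all-or-nothing reading (every syscall is re-routed once
dispatch is on), ignoring any finer-grained controls the series may
provide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative constant: open() is syscall number 2 on x86-64. */
#define MODEL_NR_OPEN 2

/* Possible outcomes for one syscall instruction in this toy model. */
enum outcome {
    OUT_SIGSYS_TO_USER, /* user dispatch bounced it back; seccomp never ran */
    OUT_SECCOMP_KILL,   /* seccomp matched the number and killed the task */
    OUT_EXECUTED        /* the kernel executed the syscall */
};

/* Hypothetical per-task state; field names are assumptions, not kernel ones. */
struct task_model {
    bool sud_enabled;    /* set by the prctl() discussed above */
    int  seccomp_denied; /* single syscall number the filter kills on, -1 = none */
};

/* Model of syscall entry with user dispatch checked before seccomp. */
enum outcome syscall_entry(const struct task_model *t, int nr)
{
    assert(t != NULL);

    /* User dispatch takes precedence, as the patch description states. */
    if (t->sud_enabled)
        return OUT_SIGSYS_TO_USER;

    /* Only syscalls that were not dispatched reach the seccomp filter. */
    if (t->seccomp_denied >= 0 && nr == t->seccomp_denied)
        return OUT_SECCOMP_KILL;

    return OUT_EXECUTED;
}
```

With sud_enabled set, syscall_entry(&t, MODEL_NR_OPEN) yields
OUT_SIGSYS_TO_USER rather than OUT_SECCOMP_KILL: the filter is never
consulted, but the kernel does not execute open() either.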

Christian