Re: [RFC] Proposal: Static SECCOMP Policies

Andy Lutomirski <luto@xxxxxxxxxxxxxx> · Fri, 13 Sep 2024 21:18:58 -0700

On Fri, Sep 13, 2024 at 10:30 AM Maxwell Bland <mbland@xxxxxxxxxxxx> wrote:
>
> On Fri, Sep 13, 2024 at 05:07:46PM GMT, Maxwell Bland wrote:
>
> > These programs will not print out using PTRACE and are difficult to audit
> > without patching the seccomp calls yourself because the ptrace call to
> > PTRACE_SECCOMP_GET_FILTER will fail. I believe (have not checked) because they
> > are not cBPF, and seccomp's logic makes prog->fprog evaluates to null despite
> > prog existing if it is cBPF, at least on Android 14. I spent a whole day
> > getting frustrated with the failing ptrace call before finally ending up my
> > patches (attached to the end) that instrument ptrace and can print the
> > programs.
>
> LOL, this paragraph is a mess, apologies: I'm referencing the failure of
> get_seccomp_filter in seccomp.c here:
>
> fprog = filter->prog->orig_prog;
> if (!fprog) {
>         /* This must be a new non-cBPF filter, since we save
>          * every cBPF filter's orig_prog above when
>          * CONFIG_CHECKPOINT_RESTORE is enabled.
>          */
>         ret = -EMEDIUMTYPE;
>         goto out;
> }
>
> Though CONFIG_CHECKPOINT_RESTORE is not set on Android 14, so I think
> the ptrace probably failed for all sorts of reasons unrelated to cBPF.
>
> But don't let me distract from the issue, which is that
> cBPF/eBPF/however these filters get allocated to machine code,
> bpf_int_jit_compile ends up getting called and a new
> privileged-executable page gets allocated without compile-time
> provenance (at least, without reverse engineering) for where that code
> came from.

Mulling over this a bit, I think there are two issues here, and
they're mostly orthogonal to each other.

The easy one first: can there be a static or somewhat static or at
least administrator-controlled list of seccomp cBPF programs?  (Where
administrator is, sadly, probably not the actual owner of a phone, but
that ship sailed a long time ago.)  Trying to make a list *and
reference that list from programs loading filters* seems like a huge
breaking change, not to mention that getting it to work right in
namespaces will be extra complex.

But what if there was a mechanism to *cryptographically hash* a BPF
program as part of the loading process?  Then that hash could be
looked up in a list, and a decision could be made based on the result.
Would this help solve any problems?
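
A rough userspace sketch of the hash-then-lookup idea, to make the
shape concrete.  Everything here is an assumption for illustration:
`struct cbpf_insn` just mirrors the field layout of the kernel's
`struct sock_filter`, `cbpf_digest()` and `policy_allows()` are
hypothetical names, and 64-bit FNV-1a stands in for a real
cryptographic hash (SHA-256 or similar) only to keep the example
short.  FNV-1a is not collision resistant and would not be acceptable
for actual enforcement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the kernel's struct sock_filter layout. */
struct cbpf_insn {
	uint16_t code;
	uint8_t  jt;
	uint8_t  jf;
	uint32_t k;
};

/* One FNV-1a step.  Stand-in only: NOT a cryptographic hash. */
static uint64_t fnv1a_byte(uint64_t h, uint8_t b)
{
	return (h ^ b) * 0x100000001b3ULL;
}

/*
 * Digest a program field by field in a fixed byte order, so the
 * result does not depend on struct padding or host endianness.
 */
uint64_t cbpf_digest(const struct cbpf_insn *prog, size_t len)
{
	uint64_t h = 0xcbf29ce484222325ULL;

	for (size_t i = 0; i < len; i++) {
		h = fnv1a_byte(h, (uint8_t)(prog[i].code & 0xff));
		h = fnv1a_byte(h, (uint8_t)(prog[i].code >> 8));
		h = fnv1a_byte(h, prog[i].jt);
		h = fnv1a_byte(h, prog[i].jf);
		for (int s = 0; s < 32; s += 8)
			h = fnv1a_byte(h, (uint8_t)((prog[i].k >> s) & 0xff));
	}
	return h;
}

/* Return 1 if the program's digest is on the allowlist, else 0. */
int policy_allows(const uint64_t *allowed, size_t n,
		  const struct cbpf_insn *prog, size_t len)
{
	assert(prog != NULL && len > 0);

	uint64_t d = cbpf_digest(prog, len);

	for (size_t i = 0; i < n; i++)
		if (allowed[i] == d)
			return 1;
	return 0;
}
```

The point of the sketch is that the loader never has to name a policy:
the decision keys off the program contents, so existing callers would
not need to change.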

Okay, on to the hard part: code integrity.  I've mulled over this a
bit from the perspective of userspace JITs and their interaction with
kernel-enforced security.  Kernel-based JITs and their interactions
with hypervisor security are rather similar.  (They're *not* the same.
The kernel can and does muck with its own pagetables.  User code
can't.  But I don't think this is a huge difference here as to the big
picture.)  There's also self-modifying code (existing executable code
that changes) and code generation (code that is created where code
previously didn't exist).  I'm going to focus on the latter.

Today, userspace can use nasty APIs to allocate writable memory, then
write to it, then change it to be executable.  This comes with gnarly
architecture-specific coherency issues, and it doesn't give a great
way for the kernel to render an intelligent opinion.  And, today, the
kernel can allocate memory (by futzing with pagetables or just using
existing maps), write some code, then either change the permissions to
executable or create a new executable alias, and then do the
architecture-specific incantation to make it coherent, then run it.
In neither case is there an amazing way for the supervisor (kernel or
hypervisor) to render an opinion about the code, and in the userspace
case, the actual efficiency of the process is quite low.
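
For reference, the userspace sequence described above looks roughly
like this.  It is a sketch under assumptions: Linux, GCC or Clang (for
`__builtin___clear_cache`), and hand-assembled "return 42" bytes for
x86-64 and arm64 only.  `run_generated_code()` is a made-up name, and
it returns -1 on other architectures or if any step is refused.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* "Load 42 into the return register, then return", per architecture. */
#if defined(__x86_64__)
static const unsigned char code[] = { 0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3 };
#define HAVE_CODE 1
#elif defined(__aarch64__)
static const unsigned char code[] = { 0x40, 0x05, 0x80, 0x52,
				      0xc0, 0x03, 0x5f, 0xd6 };
#define HAVE_CODE 1
#else
static const unsigned char code[] = { 0 };
#define HAVE_CODE 0
#endif

int run_generated_code(void)
{
	long page = sysconf(_SC_PAGESIZE);

	if (!HAVE_CODE || page <= 0)
		return -1;
	assert(sizeof(code) <= (size_t)page);

	/* Step 1: allocate writable, non-executable memory. */
	void *mem = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (mem == MAP_FAILED)
		return -1;

	/* Step 2: write the generated code. */
	memcpy(mem, code, sizeof(code));

	/* Step 3: flip the permissions to read + execute. */
	if (mprotect(mem, (size_t)page, PROT_READ | PROT_EXEC) != 0) {
		munmap(mem, (size_t)page);
		return -1;
	}

	/* Step 4: the architecture-specific coherency incantation. */
	__builtin___clear_cache((char *)mem, (char *)mem + sizeof(code));

	/* Step 5: run it. */
	int (*fn)(void) = (int (*)(void))mem;
	int ret = fn();

	munmap(mem, (size_t)page);
	return ret;
}
```

Note that nothing in this sequence tells the kernel what the bytes are
or where they came from; it only sees a permission change.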

So what would a good solution look like?  It seems to me that the
program being supervised (a userspace or kernel JIT) could generate
some kind of data structure along these lines:

- machine code to be materialized

- address and length at which to materialize it (probably
page-aligned, but maybe not)

- an "origin" of this code (perhaps a file handle?) -- I'm not 100%
sure this is useful

- a "justification" for the code.  This could be something like "Hey,
this is JITted from cBPF for seccomp, and here's the cBPF".

Or there could be a more indirect variant:

- source to be JITed (cBPF, WASM, eBPF, whatever)

- enough relocation info for the supervisor to JIT it appropriately

- address to materialize the code at, along with maximum size

and the supervisor JITs it and materializes it.
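
To make the two variants above concrete, here is one possible shape
for the request, written as plain C.  This is a sketch, not a proposed
ABI: every name (`struct materialize_req`, `materialize_req_check()`,
the enums) is invented for illustration, and requiring page alignment
is a policy assumption of the example rather than something the list
above insists on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: what kind of source the code was (or will be) JITted from. */
enum jit_source_kind { JIT_SRC_CBPF, JIT_SRC_EBPF, JIT_SRC_WASM };

/* Hypothetical: who does the JITting. */
enum materialize_kind {
	MAT_DIRECT,	/* caller supplies machine code plus a justification */
	MAT_INDIRECT,	/* caller supplies source; the supervisor JITs it */
};

struct materialize_req {
	enum materialize_kind kind;
	uint64_t addr;		/* where to materialize the code */
	uint64_t len;		/* exact length (direct) or maximum size (indirect) */

	const void *code;	/* direct only: machine code to materialize */
	size_t code_len;
	int origin_fd;		/* optional "origin" handle, -1 if none */

	enum jit_source_kind src_kind;
	const void *source;	/* direct: justification; indirect: JIT input */
	size_t source_len;

	const void *relocs;	/* indirect only: relocation info */
	size_t relocs_len;
};

/* Basic sanity checks a supervisor might run first.  0 = ok, -1 = reject. */
int materialize_req_check(const struct materialize_req *r, uint64_t page_size)
{
	assert(r != NULL);

	/* page_size must be a nonzero power of two. */
	if (page_size == 0 || (page_size & (page_size - 1)) != 0)
		return -1;
	/* Assumed policy: the target address is page-aligned. */
	if (r->addr & (page_size - 1))
		return -1;
	/* The target range must be non-empty and must not wrap. */
	if (r->len == 0 || r->addr + r->len < r->addr)
		return -1;

	switch (r->kind) {
	case MAT_DIRECT:
		return (r->code != NULL && r->code_len == r->len) ? 0 : -1;
	case MAT_INDIRECT:
		return (r->source != NULL && r->source_len > 0) ? 0 : -1;
	}
	return -1;
}
```

In the direct case the supervisor would still need some way to decide
whether the justification actually matches the machine code, which is
the part this sketch does not attempt.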

I could imagine this being used for userspace and for hypervisor-based
kernel integrity.  If there were a hypercall along these lines, would
it do what's needed here?

I can also imagine this being considerably faster than what current
userspace does.  On x86, for example, the kernel could populate a page
with the JITted code, then map that page at an address where nothing
was previously mapped, and return to userspace, and userspace could
execute that code, even on a different CPU, with no heavyweight
serialization at all.  I think the only practical way on Linux today
to do this would be to create a memfd, use write(2) or similar to fill
in the code, then mmap it executable.  You'd also have to fight with
LSMs to make sure they allow it, and maybe seal it as read-only before
mmapping it.  That latter bit kind of kills it if the goal is to write
a web browser, though -- you don't really want a whole new memfd for
each JavaScript block that gets JITted.
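
A sketch of that memfd route, under the same assumptions as any such
demo: Linux with glibc 2.27 or later (for the `memfd_create()`
wrapper), hand-assembled "return 42" bytes for x86-64 and arm64 only,
and no LSM or `vm.memfd_noexec` setting that forbids executable
memfds.  `run_from_memfd()` is a made-up name and returns -1 if any
step is refused.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* "Load 42 into the return register, then return", per architecture. */
#if defined(__x86_64__)
static const unsigned char code[] = { 0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3 };
#define HAVE_CODE 1
#elif defined(__aarch64__)
static const unsigned char code[] = { 0x40, 0x05, 0x80, 0x52,
				      0xc0, 0x03, 0x5f, 0xd6 };
#define HAVE_CODE 1
#else
static const unsigned char code[] = { 0 };
#define HAVE_CODE 0
#endif

static_assert(sizeof(code) <= 4096, "demo code must fit in one page");

int run_from_memfd(void)
{
	void *mem;
	int fd, ret;

	if (!HAVE_CODE)
		return -1;

	/* One memfd per code block: this is the cost complained about above. */
	fd = memfd_create("jit-block", MFD_CLOEXEC | MFD_ALLOW_SEALING);
	if (fd < 0)
		return -1;

	/* Fill in the code with an ordinary write(2). */
	if (write(fd, code, sizeof(code)) != (ssize_t)sizeof(code)) {
		close(fd);
		return -1;
	}

	/* Seal the contents so they cannot change after this point. */
	if (fcntl(fd, F_ADD_SEALS, F_SEAL_WRITE | F_SEAL_SHRINK |
				   F_SEAL_GROW | F_SEAL_SEAL) != 0) {
		close(fd);
		return -1;
	}

	/* Map it executable; the mapping is never writable. */
	mem = mmap(NULL, sizeof(code), PROT_READ | PROT_EXEC,
		   MAP_PRIVATE, fd, 0);
	close(fd);
	if (mem == MAP_FAILED)
		return -1;

	ret = ((int (*)(void))mem)();
	munmap(mem, sizeof(code));
	return ret;
}
```

That is four syscalls plus a file descriptor for six bytes of code,
which illustrates why this does not scale to a JIT that emits many
small blocks.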

Is any of this helpful?

--Andy