On Mon, Sep 30, 2024 at 11:22:22AM GMT, Sebastian Ene wrote:
> On Wed, Sep 25, 2024 at 12:53:11PM -0700, 'Maciej Żenczykowski' via kernel-team wrote:
> > On Wed, Sep 25, 2024 at 12:52 PM Maciej Żenczykowski <maze@xxxxxxxxxx> wrote:
> > >
> > > On Wed, Sep 25, 2024 at 11:16 AM Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> > > >
> > > > On Tue, Sep 17, 2024 at 8:08 AM Maxwell Bland <mbland@xxxxxxxxxxxx> wrote:
> > > > >
> > > > > On Fri, Sep 13, 2024 at 09:18:58PM GMT, Andy Lutomirski wrote:
> > > > > > On Fri, Sep 13, 2024 at 10:30 AM Maxwell Bland <mbland@xxxxxxxxxxxx> wrote:
> > > > > > > On Fri, Sep 13, 2024 at 05:07:46PM GMT, Maxwell Bland wrote:
> > > > > > >
> > > > > > > But don't let me distract from the issue, which is that
> > > > > > > cBPF/eBPF/however these filters get allocated to machine code,
> > > > > > > bpf_int_jit_compile ends up getting called and a new
> > > > > > > privileged-executable page gets allocated without compile-time
> > > > > > > provenance (at least, without reverse engineering) for where that code
> > > > > > > came from.
> > > > > >
> > > > > > But what if there was a mechanism to *cryptographically hash* a BPF
> > > > > > program as part of the loading process? Then that hash could be
> > > > > > looked up in a list, and a decision could be made based on the result?
> > > > > > Would this help solve any problems?
> > > > >
> > > > > The issue I have seen in the prior Qualys-linked exploit from my initial
> > > > > message and from talks by security researchers elsewhere, for example
> > > > > Google Project Zero's recent "Analyzing a Modern In-the-wild Android
> > > > > Exploit" by Seth Jenkins, is that people have the ability to target
> > > > > these pages during the window between the page being allocated as
> > > > > writable by vmalloc.c and the update to the PTE which makes it
> > > > > executable, so a signature does help (creates the requirement of more
> > > > > than one write to commit "forgery"), but doesn't totally 100% solve the
> > > > > problem.
> > > > >
> > > > > Right now, every time I open up Chrome on our latest flagship the
> > > > > browser's sandbox filters trigger my EL2 monitor because they are
> > > > > attempting to follow the standard W^X protocol. If I were to build one
> > > > > of these exploits, I'd:
> > > > >
> > > > > (1) find out a non-crashing leak for code page and data values
> > > > > (2) determine from vmalloc's rb-tree where the next one-page allocation
> > > > > is likely to occur
> > > > > (3) prime my write gadget for an offset into that page
> > > > > (4) spin up Chrome in a second thread
> > > > > (5) attempt to trigger a write (or two) at the right precise time using
> > > > > prior empirical measurement or my read gadget for kernel mem
> > > > >
> > > > > Which is messy, but people have been known to do more given good enough
> > > > > stakes. Hell, I spent a few months working on something similar for
> > > > > airplane communication management units.
> > > >
> > > > My vague proposal for a "better JIT API" (which you quoted below)
> > > > explicitly and completely solves this problem:
> > > >
> > > > >
> > > > > > So what would a good solution look like?
> > > > > > It seems to me that the
> > > > > > program being supervised (a userspace or kernel JIT) could generate
> > > > > > some kind of data structure along these lines:
> > > > > >
> > > > > > - machine code to be materialized
> > > > > >
> > > > > > - address and length at which to materialize it (probably
> > > > > > page-aligned, but maybe not)
> > > > > >
> > > > > > - an "origin" of this code (perhaps a file handle?) -- I'm not 100%
> > > > > > sure this is useful
> > > > > >
> > > > > > - a "justification" for the code. This could be something like "Hey,
> > > > > > this is JITted from cBPF for seccomp, and here's the cBPF".
> > > >
> > > > Even ignoring the origin and justification parts, there's no WX window
> > > > in here. The code is generated, then it's shipped off to the
> > > > hypervisor/supervisor, and *exactly that code* is materialized !W, X.
> > > >
> > > > Of course, this still leaves verification to be handled.
> > > >
> > > > > Returning to the idea of origins, at the end of the work day yesterday I
> > > > > queried Maciej to "have Android choose one compiler for seccomp policies
> > > > > to BPF and stick with it", because if I knew filters were chosen by
> > > > > libminijail or some other userspace system, I could pretty easily figure
> > > > > out what EL2 needs to expect at runtime. An "origin" field would be
> > > > > equally as effective, and retain flexibility.
> > > >
> > > > At the risk of a silly suggestion, what if the entire JIT compiler and
> > > > verifier (or a sufficient portion) were, itself, a WASM (or similar)
> > > > program, signed or whatever, and shipped off to the hypervisor? The
> > > > hypervisor could run it (in whatever sandbox it likes -- hypervisors
> > > > are capable of spawning a separate VM to host it if needed), and only
> > > > then accept the output.
> > > >
> > > > I, personally, think that this is of extremely dubious value unless
> > > > it's paired with a control flow integrity system.
> > > > But maybe it could
> > > > be! Something like x86 IBT would be a start, and FineIBT would be
> > > > better, as would an ARM equivalent.
> > > >
> > > > --Andy
> > > >
> Hi,
>
> In response to your previous message (this is Seb from pKVM team):
> >
> > > I've heard rumours (probably read some LWN article perhaps
> > > https://lwn.net/Articles/836693/ ) that protected kvm for Android has
> > > some mechanism to start the kernel in some higher priv level (EL2?),
> > > then move most of it to EL1 while keeping a protected VPN shim in EL2.
> >
> > s/VPN/KVM/
>
> Yes we do initialize the pKVM hypervisor at EL2 fairly early at
> device_initcall_sync (initcall 5) before we deprivilege the rest of the
> kernel at EL1.
>

Implementing code page integrity checks in pKVM as a reference spec for
all the other EL2 developers and the kernel to "do the right thing" for
hypervisor-based exploit prevention and kernel integrity checking would
be a major success for ARM/Google. I am hoping I can get Moto to release
our code.

> >
> > >
> > > Perhaps the answer is to leave the bpf verifier + jit compiler in EL2?
> > >
> What are the gains to move this at EL2? I am a bit late to this party.
> We don't have any init at that stage because it is too early. We do
> support some EL2 vendor modules loading from a ramdisk but this is a
> different story.
>

I see moving the full JIT/verifier into EL2 as problematic because of
increased threat surface.
We've seen many Project Zero-originated and third-party exploits
targeting EL2 SMC interfaces on Android: *cough* a certain
galactic-themed phone manufacturer, which claims to have a system
protecting these code pages but never seemed to mention the
complications seccomp creates, let alone the impossibility of filtering
page table updates on Snapdragon chipsets without reworking vmalloc
infrastructure in what must be a GPL-2.0 compliant interface they never
made open source, had serious SMC-call based CVEs in the past *cough*
https://project-zero.issues.chromium.org/issues/42452502 *cough*
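
For concreteness, the request structure Andy outlines earlier in the
thread (machine code, address and length, origin, justification) might
look roughly like the C sketch below. This is only an illustration under
stated assumptions, not an existing kernel or pKVM interface: every name
in it (jit_materialize_req, jit_req_check, the justification tags, the
4 KiB page size) is hypothetical, and the strict page-alignment check is
an assumption that goes further than Andy's "probably page-aligned, but
maybe not". The check only validates the shape of the request; verifying
the code itself is still the open problem.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size; a real supervisor would take this from the platform. */
#define JIT_PAGE_SIZE 4096u
static_assert((JIT_PAGE_SIZE & (JIT_PAGE_SIZE - 1)) == 0,
              "page size must be a power of two");

/* Hypothetical tags for the "justification" field. */
enum jit_justification {
    JIT_JUSTIFY_NONE = 0,
    JIT_JUSTIFY_SECCOMP_CBPF, /* "JITted from cBPF for seccomp" */
    JIT_JUSTIFY_EBPF,
};

/* Hypothetical request handed from the JIT to the hypervisor/supervisor. */
struct jit_materialize_req {
    const uint8_t *code;        /* machine code to be materialized */
    size_t code_len;            /* length of that code in bytes */
    uint64_t dest_addr;         /* address at which to materialize it */
    int origin_fd;              /* optional "origin" handle, -1 if none */
    enum jit_justification why; /* why this code exists */
    const void *source;         /* e.g. the cBPF the code was JITted from */
    size_t source_len;
};

/*
 * Shape check only: returns 0 if the request is well-formed, or a
 * negative value naming the first failed check. It does not inspect
 * the machine code.
 */
int jit_req_check(const struct jit_materialize_req *req)
{
    if (req == NULL || req->code == NULL || req->code_len == 0)
        return -1; /* nothing to materialize */
    if ((req->dest_addr & (JIT_PAGE_SIZE - 1)) != 0)
        return -2; /* assumption: destination must be page-aligned */
    if (UINT64_MAX - req->dest_addr < req->code_len)
        return -3; /* address range would wrap around */
    if (req->why != JIT_JUSTIFY_NONE &&
        (req->source == NULL || req->source_len == 0))
        return -4; /* a justification must carry its source program */
    return 0;
}
```

A real supervisor would presumably copy the code out of the caller's
reach before checking it, so that what is verified is exactly what ends
up mapped !W, X.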