On Sat, Jun 17, 2023 at 11:34:31AM -0400, Kent Overstreet wrote:
> On Fri, Jun 16, 2023 at 09:13:22PM -0700, Andy Lutomirski wrote:
> > On 5/16/23 14:20, Kent Overstreet wrote:
> > > On Tue, May 16, 2023 at 02:02:11PM -0700, Kees Cook wrote:
> > > > For something that small, why not use the text_poke API?
> > >
> > > This looks like it's meant for patching existing kernel text, which
> > > isn't what I want - I'm generating new functions on the fly, one per
> > > btree node.
> >
> > Dynamically generating code is a giant can of worms.
> >
> > Kees touched on a basic security thing: a linear address mapped W+X is a big
> > no-no. And that's just scratching the surface -- ideally we would have a
> > strong protocol for generating code: the code is generated in some
> > extra-secure context, then it's made immutable and double-checked, then
> > it becomes live.
>
> "Double checking" arbitrary code is fantasy. You can't "prove the
> security" of arbitrary code post compilation.

I think there's a misunderstanding here about the threat model I'm
interested in protecting against for JITs. Making sure the VM of a JIT
is safe in itself matters, but that's separate from what I'm concerned
about.

The threat model is about flaws _elsewhere_ in the kernel that can
leverage the JIT machinery to convert a "write anything anywhere
anytime" exploit primitive into an "execute anything" primitive.
Arguments can be made to say "a write anything flaw means the total
collapse of the security model so there's no point defending against
it", but neither that type of flaw nor the slippery slope argument
stands up well to real-world situations.

The kinds of flaws we've seen are frequently limited in scope (write 1
byte, write only NULs, write only in a specific range, etc), but when
chained together, the weakest link is what ultimately compromises the
kernel.

As such, "W^X" is a basic building block of the kernel's self-defense
methods, because memory that is both writable and executable is such a
potent target for write->execute attack upgrades.

Since a JIT constructs something that will become executable, it needs
to defend itself against stray writes from other threads. Since Linux
doesn't (really) use per-CPU page tables, the workspace for a JIT can be
targeted by something that isn't the JIT. To deal with this, JITs need
to use 3 phases: a writing pass (into W memory), then a switch to RO and
a verification pass (construct it again, but compare results to the RO
version), and finally a switch to executable. Or, they can use writes to
memory that only the local CPU can perform (i.e. text_poke(), which uses
a different set of page tables with different permissions).

Without basic W^X, it becomes extremely difficult to build further
defenses (e.g. protecting page tables themselves, etc) since WX will
remain the easiest target.

-Kees

--
Kees Cook
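
For illustration only, the three-phase sequence described above (write
into W memory, switch to RO and regenerate to compare, then switch to
executable) could look roughly like the userspace sketch below. This is
an analogy, not kernel code: it uses mmap()/mprotect() in place of the
kernel's own permission helpers, and every name in it (struct
jit_region, emit_fn, emit_demo, jit_write, jit_seal_and_verify,
jit_make_exec, jit_free) is hypothetical. It also assumes the generator
is deterministic, so that a second run can be compared with the first.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical generator callback: writes at most cap bytes into out and
 * returns the number of bytes written. It must be deterministic. */
typedef size_t (*emit_fn)(uint8_t *out, size_t cap, const void *ctx);

struct jit_region {
	uint8_t *base;    /* page-aligned mapping */
	size_t   map_len; /* size of the mapping in bytes */
	size_t   len;     /* bytes of generated code */
};

/* Demo generator (assumption): the x86-64 encoding of
 * "mov eax, imm32; ret". The bytes are never executed in this sketch. */
size_t emit_demo(uint8_t *out, size_t cap, const void *ctx)
{
	uint32_t imm = *(const uint32_t *)ctx;

	assert(cap >= 6);
	out[0] = 0xb8;                      /* mov eax, imm32 */
	memcpy(out + 1, &imm, sizeof(imm)); /* the immediate */
	out[5] = 0xc3;                      /* ret */
	return 6;
}

/* Phase 1: generate into memory that is writable but not executable. */
int jit_write(struct jit_region *r, emit_fn emit, const void *ctx)
{
	long page = sysconf(_SC_PAGESIZE);
	void *p;

	if (page <= 0)
		return -1;
	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;
	r->base = p;
	r->map_len = (size_t)page;
	r->len = emit(r->base, r->map_len, ctx);
	return 0;
}

/* Phase 2: make the region read-only, then regenerate into a scratch
 * buffer and compare. A stray write that landed during phase 1 shows up
 * here as a mismatch. */
int jit_seal_and_verify(struct jit_region *r, emit_fn emit, const void *ctx)
{
	uint8_t *scratch;
	size_t n;
	int ret = -1;

	if (mprotect(r->base, r->map_len, PROT_READ) != 0)
		return -1;
	scratch = malloc(r->len);
	if (!scratch)
		return -1;
	n = emit(scratch, r->len, ctx);
	if (n == r->len && memcmp(scratch, r->base, n) == 0)
		ret = 0;
	free(scratch);
	return ret;
}

/* Phase 3: only after verification does the region become executable,
 * and it is never writable and executable at the same time. */
int jit_make_exec(struct jit_region *r)
{
	return mprotect(r->base, r->map_len, PROT_READ | PROT_EXEC);
}

void jit_free(struct jit_region *r)
{
	munmap(r->base, r->map_len);
	r->base = NULL;
}
```

A caller would run jit_write(), jit_seal_and_verify() and jit_make_exec()
in that order and discard the region if any step fails. The comparison
in phase 2 only catches writes that happened before the switch to RO;
after that point the protection comes from the page permissions
themselves.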