On 06/02/2025 23:41, Kees Cook wrote:
> On Mon, Feb 03, 2025 at 10:18:24AM +0000, Kevin Brodsky wrote:
>> This is a proposal to leverage protection keys (pkeys) to harden
>> critical kernel data, by making it mostly read-only. The series includes
>> a simple framework called "kpkeys" to manipulate pkeys for in-kernel use,
>> as well as a page table hardening feature based on that framework
>> (kpkeys_hardened_pgtables). Both are implemented on arm64 as a proof of
>> concept, but they are designed to be compatible with any architecture
>> implementing pkeys.
> Does QEMU support POE? The only mention I could find is here:
> https://mail.gnu.org/archive/html/qemu-arm/2024-03/msg00486.html
> where the answer is, "no and it looks difficult". :P

Unfortunately it looks like the answer hasn't changed since last year. I
am testing this series on an Arm Fast Models platform (FVP) [1], which
does support POE. I've included instructions to get you started at the
end.

>> # Threat model
>>
>> The proposed scheme aims at mitigating data-only attacks (e.g.
>> use-after-free/cross-cache attacks). In other words, it is assumed that
>> control flow is not corrupted, and that the attacker does not achieve
>> arbitrary code execution. Nothing prevents the pkey register from being
>> set to its most permissive state - the assumption is that the register
>> is only modified on legitimate code paths.
> Do you have any tests that could be added to drivers/misc/lkdtm that
> explicitly exercise the protection? That is where many hardware security
> features get tested. (i.e. a successful test will generally trigger a
> BUG_ON or similar.)

I could certainly add some tests there, but I wonder if such crash tests
provide much benefit compared to the KUnit tests (that rely on
copy_to_kernel_nofault()) in patch 15? Not crashing the kernel does mean
that many of those tests can be run in a row :)

>> The arm64 implementation should be considered a proof of concept only.
>> The enablement of POE for in-kernel use is incomplete; in particular
>> POR_EL1 (pkey register) should be reset on exception entry and restored
>> on exception return.
> As in, make sure the loaded pkey isn't leaked into an exception handler?

I wouldn't say "leaking" is the issue here, but yes conceptually
exception handlers should run with a fixed pkey configuration, not that
of the interrupted context.

As Dave Hansen pointed out [2], what is even more important is to
context-switch the pkey register. A thread may be interrupted and
scheduled out while executing at a higher kpkeys level; we want to
ensure that this thread resumes execution at the same kpkeys level, and
that in the meantime we return to the standard level.

>> # Open questions
>>
>> A few aspects in this RFC that are debatable and/or worth discussing:
>>
>> - There is currently no restriction on how kpkeys levels map to pkeys
>>   permissions. A typical approach is to allocate one pkey per level and
>>   make it writable at that level only. As the number of levels
>>   increases, we may however run out of pkeys, especially on arm64 (just
>>   8 pkeys with POE). Depending on the use-cases, it may be acceptable to
>>   use the same pkey for the data associated to multiple levels.
>>
>>   Another potential concern is that a given piece of code may require
>>   write access to multiple privileged pkeys. This could be addressed by
>>   introducing a notion of hierarchy in trust levels, where Tn is able to
>>   write to memory owned by Tm if n >= m, for instance.
>>
>> - kpkeys_set_level() and kpkeys_restore_pkey_reg() are not symmetric:
>>   the former takes a kpkeys level and returns a pkey register value, to
>>   be consumed by the latter. It would be more intuitive to manipulate
>>   kpkeys levels only. However this assumes that there is a 1:1 mapping
>>   between kpkeys levels and pkey register values, while in principle
>>   the mapping is 1:n (certain pkeys may be used outside the kpkeys
>>   framework).
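
To make the second point above a bit more concrete, here is a minimal
user-space sketch of the set/restore asymmetry. It is not the code from
the series. Only the names kpkeys_set_level() and
kpkeys_restore_pkey_reg(), and the fact that the former takes a level and
returns a register value consumed by the latter, come from the cover
letter. Everything else is an assumption made for illustration: the
4-bits-per-pkey encoding, the sim_pkey_reg variable standing in for the
pkey register, the pkey numbers, the idea that the returned value is the
previous register contents, and the kpkeys_demo() helper.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoding: 4 permission bits per pkey in one 64-bit register. */
#define PERM_RO			0x1u
#define PERM_RW			0x3u
#define PKEY_SHIFT(pkey)	((pkey) * 4)
#define PKEY_MASK(pkey)		((uint64_t)0xf << PKEY_SHIFT(pkey))

/* Hypothetical pkey allocation; PKEY_OTHER is used outside kpkeys. */
enum { PKEY_DEFAULT = 0, PKEY_PGTABLES = 1, PKEY_OTHER = 5 };
enum { KPKEYS_LVL_DEFAULT = 0, KPKEYS_LVL_PGTABLES = 1 };

/* Stands in for the hardware pkey register (POR_EL1 on arm64). */
static uint64_t sim_pkey_reg;

static uint64_t set_perm(uint64_t reg, int pkey, uint64_t perm)
{
	return (reg & ~PKEY_MASK(pkey)) | (perm << PKEY_SHIFT(pkey));
}

static uint64_t get_perm(uint64_t reg, int pkey)
{
	return (reg >> PKEY_SHIFT(pkey)) & 0xf;
}

/* Switch to @level and return the previous register value (assumed). */
static uint64_t kpkeys_set_level(int level)
{
	uint64_t prev = sim_pkey_reg;
	uint64_t perm = (level == KPKEYS_LVL_PGTABLES) ? PERM_RW : PERM_RO;

	/* Only the pkey owned by the framework is modified. */
	sim_pkey_reg = set_perm(prev, PKEY_PGTABLES, perm);
	return prev;
}

/* Restore a full register value, including pkeys kpkeys knows nothing about. */
static void kpkeys_restore_pkey_reg(uint64_t pkey_reg)
{
	sim_pkey_reg = pkey_reg;
}

/* Returns 0 if the register is bit-for-bit identical after set + restore. */
int kpkeys_demo(void)
{
	uint64_t before, saved;

	/* Default level: page tables read-only; pkey 5 is managed elsewhere. */
	sim_pkey_reg = set_perm(0, PKEY_DEFAULT, PERM_RW);
	sim_pkey_reg = set_perm(sim_pkey_reg, PKEY_PGTABLES, PERM_RO);
	sim_pkey_reg = set_perm(sim_pkey_reg, PKEY_OTHER, PERM_RW);
	before = sim_pkey_reg;

	saved = kpkeys_set_level(KPKEYS_LVL_PGTABLES);
	assert(get_perm(sim_pkey_reg, PKEY_PGTABLES) == PERM_RW);
	/* ... page table writes would happen here ... */
	kpkeys_restore_pkey_reg(saved);

	return sim_pkey_reg == before ? 0 : 1;
}
```

In this sketch the saved value also carries the state of PKEY_OTHER, which
a level number alone could not encode: the same level corresponds to
several register values depending on what the non-kpkeys pkeys are set
to. Calling kpkeys_demo() from a small main() is enough to try it out.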
> Is the "levels" nature of this related to how POE behaves? It sounds
> like there can only be 1 pkey active at a time (a role), rather than
> each pkey representing access to a specific set of pages (a key in a
> keyring), where many pkeys could be active at the same time. Am I
> understanding that correctly?

Only one key is used (besides the default key) in this initial RFC.
However, the idea behind the level abstraction is indeed that (RW)
access to multiple keys may be required at the same time. In the
follow-up RFC protecting credentials, this is illustrated by the
"unrestricted" level that grants RW access to all keys. I believe this
approach is the most flexible, in that any permission mapping can be
defined for each level.

>> Any comment or feedback will be highly appreciated, be it on the
>> high-level approach or implementation choices!
> As hinted earlier with my QEMU question... what's the best way I can I
> test this myself? :)

As mentioned above I tested this series on Arm FVP. By far the easiest
way to run some custom kernel/rootfs on FVP is to use the Shrinkwrap
tool [3]. First install it following the quick start guide [4] (I would
recommend using the Docker backend if possible). Then build the firmware
stack using:

  $ shrinkwrap build -o arch/v9.0.yaml ns-edk2.yaml

To make things easy, the runtime configuration can be stored in a file.
Create ~/.shrinkwrap/config/poe.yaml with the following contents:

----8<----
%YAML 1.2
---
layers:
  - arch/v9.0.yaml

run:
  rtvars:
    CMDLINE:
      type: string
      # nr_cpus=1 can be added to speed up the boot
      value: console=ttyAMA0 earlycon=pl011,0x1c090000 root=/dev/vda rw
  params:
    -C cluster0.has_permission_overlay_s1: 1
    -C cluster1.has_permission_overlay_s1: 1
----8<----

Finally start FVP using:

  $ shrinkwrap run -o poe.yaml ns-edk2.yaml \
      -r KERNEL=<out>/arch/arm64/boot/Image \
      -r ROOTFS=<rootfs.img>

(Use Ctrl-] to terminate the model if needed.)

<rootfs.img> is a file containing the root filesystem (in raw format,
e.g. ext4).
The kernel itself is built as usual (defconfig works just fine), just
make sure to select CONFIG_KPKEYS_HARDENED_PGTABLES to enable the
feature. You can also select CONFIG_KPKEYS_HARDENED_PGTABLES_TEST to run
the tests in patch 15.

> Thanks for working on this! Data-only attacks have been on the rise for
> a while now, and I'm excited to see some viable mitigations appearing.
> Yay!

Thank you for your interest and support, much appreciated!

- Kevin

[1] https://developer.arm.com/Tools%20and%20Software/Fixed%20Virtual%20Platforms/Arm%20Architecture%20FVPs
[2] https://lore.kernel.org/linux-hardening/dcc1800c-cf0a-4d88-bc88-982f0709b382@xxxxxxxxx/
[3] https://shrinkwrap.docs.arm.com/
[4] https://shrinkwrap.docs.arm.com/en/latest/userguide/quickstart.html