Re: [RFC PATCH v3 00/15] pkeys-based page table hardening

Kees Cook <kees@xxxxxxxxxx> · Thu, 6 Feb 2025 14:41:30 -0800

On Mon, Feb 03, 2025 at 10:18:24AM +0000, Kevin Brodsky wrote:
> This is a proposal to leverage protection keys (pkeys) to harden
> critical kernel data, by making it mostly read-only. The series includes
> a simple framework called "kpkeys" to manipulate pkeys for in-kernel use,
> as well as a page table hardening feature based on that framework
> (kpkeys_hardened_pgtables). Both are implemented on arm64 as a proof of
> concept, but they are designed to be compatible with any architecture
> implementing pkeys.

Does QEMU support POE? The only mention I could find is here:
https://mail.gnu.org/archive/html/qemu-arm/2024-03/msg00486.html
where the answer is, "no and it looks difficult". :P

> # Threat model
> 
> The proposed scheme aims at mitigating data-only attacks (e.g.
> use-after-free/cross-cache attacks). In other words, it is assumed that
> control flow is not corrupted, and that the attacker does not achieve
> arbitrary code execution. Nothing prevents the pkey register from being
> set to its most permissive state - the assumption is that the register
> is only modified on legitimate code paths.

Do you have any tests that could be added to drivers/misc/lkdtm that
explicitly exercise the protection? That is where many hardware security
features get tested. (i.e. a successful test will generally trigger a
BUG_ON or similar.)

> The arm64 implementation should be considered a proof of concept only.
> The enablement of POE for in-kernel use is incomplete; in particular
> POR_EL1 (pkey register) should be reset on exception entry and restored
> on exception return.

As in, make sure the loaded pkey isn't leaked into an exception handler?
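
To spell out what I mean, here is a toy userspace model of that
save/reset/restore pattern. It is only a sketch: the toy_* names, the
one-bit-per-pkey register and the default value are made up for
illustration and are not the arm64 entry code or the real POR_EL1 layout.

```c
#include <stdint.h>

/* Toy stand-in for the pkey register: bit N set = pkey N is writable.
 * This is NOT the real POR_EL1 encoding. */
#define TOY_PKEY_REG_DEFAULT 0x1u	/* assumed: only pkey 0 writable */

static uint32_t toy_pkey_reg = TOY_PKEY_REG_DEFAULT;

/* Hypothetical exception entry: save the interrupted value, then reset
 * the register so the handler starts from the default permissions. */
static uint32_t toy_exception_enter(void)
{
	uint32_t saved = toy_pkey_reg;

	toy_pkey_reg = TOY_PKEY_REG_DEFAULT;
	return saved;
}

/* Hypothetical exception return: restore the interrupted value. */
static void toy_exception_exit(uint32_t saved)
{
	toy_pkey_reg = saved;
}

/* Simulate one exception; returns the register value the handler saw. */
static uint32_t toy_take_exception(void)
{
	uint32_t saved = toy_exception_enter();
	uint32_t seen_by_handler = toy_pkey_reg;

	toy_exception_exit(saved);
	return seen_by_handler;
}
```

If the interrupted code had unlocked an extra pkey, the handler in this
model still runs with the default value, and the interrupted code gets
its own value back afterwards.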

> # Open questions
> 
> A few aspects in this RFC that are debatable and/or worth discussing:
> 
> - There is currently no restriction on how kpkeys levels map to pkeys
>   permissions. A typical approach is to allocate one pkey per level and
>   make it writable at that level only. As the number of levels
>   increases, we may however run out of pkeys, especially on arm64 (just
>   8 pkeys with POE). Depending on the use-cases, it may be acceptable to
>   use the same pkey for the data associated to multiple levels.
> 
>   Another potential concern is that a given piece of code may require
>   write access to multiple privileged pkeys. This could be addressed by
>   introducing a notion of hierarchy in trust levels, where Tn is able to
>   write to memory owned by Tm if n >= m, for instance.
> 
> - kpkeys_set_level() and kpkeys_restore_pkey_reg() are not symmetric:
>   the former takes a kpkeys level and returns a pkey register value, to
>   be consumed by the latter. It would be more intuitive to manipulate
>   kpkeys levels only. However this assumes that there is a 1:1 mapping
>   between kpkeys levels and pkey register values, while in principle
>   the mapping is 1:n (certain pkeys may be used outside the kpkeys
>   framework).

Is the "levels" nature of this related to how POE behaves? It sounds
like there can only be 1 pkey active at a time (a role), rather than
each pkey representing access to a specific set of pages (a key in a
keyring), where many pkeys could be active at the same time. Am I
understanding that correctly?
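
To make the question concrete, here is a toy userspace model of the
"keyring" reading, where the register carries a permission bit for every
pkey and a level simply selects a register value. All of it is assumed
for illustration: the toy_* names, the one-bit-per-pkey layout, the
hierarchical level-to-pkey mapping (the Tn >= Tm idea above), and that
the set-level call returns the previous register value. None of it is
the actual kpkeys API.

```c
#include <stdint.h>

#define TOY_NR_PKEYS 8	/* "just 8 pkeys with POE", per the cover letter */

/* Toy register: bit N set = memory tagged with pkey N is writable.
 * This is NOT the real POR_EL1 encoding. Starts at level 0. */
static uint32_t toy_pkey_reg = 0x1u;

/* Hypothetical hierarchical mapping: level n may write pkeys 0..n. */
static uint32_t toy_level_to_reg(unsigned int level)
{
	return (1u << (level + 1)) - 1;
}

/* Mirrors the asymmetry described above: takes a level, but hands back
 * a register value (assumed here to be the previous one). */
static uint32_t toy_set_level(unsigned int level)
{
	uint32_t prev = toy_pkey_reg;

	toy_pkey_reg = toy_level_to_reg(level);
	return prev;
}

/* Consumes the register value returned by toy_set_level(). */
static void toy_restore_pkey_reg(uint32_t prev)
{
	toy_pkey_reg = prev;
}

/* Would a write to memory tagged with this pkey be allowed right now? */
static int toy_can_write(unsigned int pkey)
{
	return pkey < TOY_NR_PKEYS && ((toy_pkey_reg >> pkey) & 1u);
}
```

In this model several pkeys are writable at once after
toy_set_level(2). If only one pkey can be active at a time, the register
would instead collapse to a single index and the hierarchy idea would
need a different encoding.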

> Any comment or feedback will be highly appreciated, be it on the
> high-level approach or implementation choices!

As hinted earlier with my QEMU question... what's the best way I can
test this myself? :)

Thanks for working on this! Data-only attacks have been on the rise for
a while now, and I'm excited to see some viable mitigations appearing.
Yay!

-Kees

-- 
Kees Cook