On 25/05/2023 15:59, Mickaël Salaün wrote:
On 25/05/2023 00:20, Edgecombe, Rick P wrote:
On Fri, 2023-05-05 at 17:20 +0200, Mickaël Salaün wrote:
# How does it work?
This implementation mainly leverages KVM capabilities to control the Second
Layer Address Translation (or Two Dimensional Paging, e.g., Intel's EPT or
AMD's RVI/NPT) and Mode Based Execution Control (Intel's MBEC) introduced with
the Kaby Lake (7th generation) architecture. This makes it possible to set
permissions on memory pages in a way that complements the memory permissions
managed by the guest kernel. Once these permissions are set, they are locked
and there is no way back.
A first KVM_HC_LOCK_MEM_PAGE_RANGES hypercall enables the guest kernel to lock
a set of its memory page ranges with either the HEKI_ATTR_MEM_NOWRITE or the
HEKI_ATTR_MEM_EXEC attribute. The first one denies write access to a specific
set of pages (allow-list approach), and the second only allows kernel execution
for a set of pages (deny-list approach).
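To make the semantics of the two attributes concrete, here is a minimal
user-space model of them. It is only a sketch, not code from the series: the
attribute bit values, the model_* names, the range layout and the "one lock
request, then frozen" behavior are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values; the series defines the real ones. */
#define HEKI_ATTR_MEM_NOWRITE	(1u << 0)
#define HEKI_ATTR_MEM_EXEC	(1u << 1)

#define MODEL_MAX_RANGES	8

/* Hypothetical range descriptor: guest frame numbers, inclusive bounds. */
struct model_range {
	uint64_t gfn_start;
	uint64_t gfn_end;
	uint32_t attrs;
};

struct model_guest {
	struct model_range ranges[MODEL_MAX_RANGES];
	size_t nr_ranges;
	bool locked;	/* set by the first accepted lock request */
};

/* Models the lock request: accepted once, refused afterwards (assumption). */
static int model_lock_ranges(struct model_guest *g,
			     const struct model_range *r, size_t n)
{
	if (g->locked || n > MODEL_MAX_RANGES)
		return -1;
	for (size_t i = 0; i < n; i++)
		g->ranges[i] = r[i];
	g->nr_ranges = n;
	g->locked = true;
	return 0;
}

static bool model_has_attr(const struct model_guest *g, uint64_t gfn,
			   uint32_t attr)
{
	for (size_t i = 0; i < g->nr_ranges; i++) {
		const struct model_range *r = &g->ranges[i];

		if (gfn >= r->gfn_start && gfn <= r->gfn_end &&
		    (r->attrs & attr))
			return true;
	}
	return false;
}

/* Writes are denied only on pages explicitly marked NOWRITE. */
static bool model_may_write(const struct model_guest *g, uint64_t gfn)
{
	return !model_has_attr(g, gfn, HEKI_ATTR_MEM_NOWRITE);
}

/* Once locked, kernel execution is allowed only on pages marked EXEC. */
static bool model_may_kernel_exec(const struct model_guest *g, uint64_t gfn)
{
	return !g->locked || model_has_attr(g, gfn, HEKI_ATTR_MEM_EXEC);
}

/* Usage: a .text-like range (NOWRITE | EXEC) and a .rodata-like one. */
static void model_demo(void)
{
	struct model_guest g = { 0 };
	const struct model_range r[] = {
		{ 0x100, 0x1ff, HEKI_ATTR_MEM_NOWRITE | HEKI_ATTR_MEM_EXEC },
		{ 0x200, 0x2ff, HEKI_ATTR_MEM_NOWRITE },
	};

	assert(model_lock_ranges(&g, r, 2) == 0);
	assert(!model_may_write(&g, 0x150) && model_may_kernel_exec(&g, 0x150));
	assert(!model_may_write(&g, 0x250) && !model_may_kernel_exec(&g, 0x250));
	assert(model_may_write(&g, 0x300) && !model_may_kernel_exec(&g, 0x300));
	assert(model_lock_ranges(&g, r, 2) == -1);	/* no way back */
}
```

In this model, a page outside every range stays writable but is never
kernel-executable once the lock request has been accepted.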
The current implementation sets the whole kernel's .rodata (i.e., any const or
__ro_after_init variables, which includes critical security data such as LSM
parameters) and .text sections as non-writable, and the .text section is the
only one where kernel execution is allowed. This is possible thanks to the new
MBEC support also brought by this series (otherwise the vDSO would have to be
executable). Thanks to this hardware support (VT-x, EPT and MBEC), the
performance impact of such guest protection is negligible.
The second KVM_HC_LOCK_CR_UPDATE hypercall enables guests to pin some of their
CPU control register flags (e.g., X86_CR0_WP, X86_CR4_SMEP, X86_CR4_SMAP),
which is another complementary hardening mechanism.
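The pinning idea can be sketched with a small user-space model. This is not
the code from the series: the model_* names and the exact refusal behavior are
assumptions for illustration. Only the bit positions of the three flags are
architectural (CR0.WP is bit 16, CR4.SMEP is bit 20, CR4.SMAP is bit 21).

```c
#include <assert.h>
#include <stdint.h>

/* Architectural bit positions of the flags named above. */
#define X86_CR0_WP	(UINT64_C(1) << 16)
#define X86_CR4_SMEP	(UINT64_C(1) << 20)
#define X86_CR4_SMAP	(UINT64_C(1) << 21)

/* Hypothetical per-register state kept on the host side. */
struct model_cr {
	uint64_t value;
	uint64_t pinned;	/* flags that may no longer be cleared */
};

/* Models the pin request: the pinned set can only grow. */
static void model_pin(struct model_cr *cr, uint64_t flags)
{
	cr->pinned |= flags;
}

/* Models a guest write: refused if it would clear any pinned flag. */
static int model_write_cr(struct model_cr *cr, uint64_t new_value)
{
	if ((new_value & cr->pinned) != cr->pinned)
		return -1;
	cr->value = new_value;
	return 0;
}

/* Usage: pin SMEP and SMAP in CR4, then try to turn SMEP off. */
static void model_cr_demo(void)
{
	struct model_cr cr4 = { .value = X86_CR4_SMEP | X86_CR4_SMAP };

	model_pin(&cr4, X86_CR4_SMEP | X86_CR4_SMAP);
	assert(model_write_cr(&cr4, X86_CR4_SMAP) == -1);
	assert(cr4.value == (X86_CR4_SMEP | X86_CR4_SMAP));
}
```

Unpinned flags can still be toggled freely in this model; only a write that
drops a pinned flag is refused.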
Heki can be enabled with the heki=1 boot command argument.
Can the guest kernel ask the host VMM's emulated devices to DMA into
the protected data? It should go through the host userspace mappings I
think, which don't care about EPT permissions. Or did I miss where you
are protecting that another way? There are a lot of easy ways to ask
the host to write to guest memory that don't involve the EPT. You
probably need to protect the host userspace mappings, and also the
places in KVM that kmap a GPA provided by the guest.
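As a toy illustration of this concern (a sketch with hypothetical model_*
names, assuming the host path works as described above): the EPT write
permission is only consulted on the guest access path, while a host-side write
through its own mapping reaches the same memory unchecked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one guest page and its EPT write permission. */
struct model_page {
	uint8_t data;
	bool ept_write_denied;
};

/* Guest access path: goes through the EPT permission check. */
static int model_guest_write(struct model_page *p, uint8_t v)
{
	if (p->ept_write_denied)
		return -1;	/* would be an EPT violation */
	p->data = v;
	return 0;
}

/* Host path (e.g., emulated device DMA): never looks at the EPT. */
static void model_host_write(struct model_page *p, uint8_t v)
{
	p->data = v;
}

/* Usage: the guest is blocked, the host-side write still lands. */
static void model_dma_demo(void)
{
	struct model_page page = { .data = 1, .ept_write_denied = true };

	assert(model_guest_write(&page, 2) == -1 && page.data == 1);
	model_host_write(&page, 3);
	assert(page.data == 3);
}
```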
Good point, I'll check this confused deputy attack. Extended KVM
protections should indeed handle all ways to map guests' memory. I'm
wondering if current VMMs would gracefully handle such new restrictions
though.
I guess the host could map arbitrary data to the guest, so that needs to
be handled, but how could the VMM (not the host kernel) bypass/update the
EPT initially used for the guest (and potentially later mapped to the host)?