Hi Sean,

On Fri May 19, 2023 at 6:23 PM UTC, Sean Christopherson wrote:
> On Fri, May 19, 2023, Nicolas Saenz Julienne wrote:
> > Hi,
> >
> > On Fri Dec 2, 2022 at 6:13 AM UTC, Chao Peng wrote:

[...]

> > VSM introduces isolated guest execution contexts called Virtual Trust
> > Levels (VTL) [2]. Each VTL has its own memory access protections,
> > virtual processor states, interrupt controllers and overlay pages.
> > VTLs are hierarchical and may enforce memory protections on less
> > privileged VTLs. Memory protections are enforced at a per-GPA
> > granularity.
> >
> > We implemented this in the past by using a separate address space per
> > VTL and updating memory regions on protection changes. But having to
> > update the memory slot layout for every permission change scales
> > poorly, especially as we have to perform 100,000s of these operations
> > at boot (see [1] for a little more context).
> >
> > I believe the biggest barrier for us to use memory attributes is not
> > having the ability to target specific address spaces, or at the very
> > least having some mechanism to maintain multiple independent layers
> > of attributes.
>
> Can you elaborate on "specific address spaces"? In KVM, that usually
> means SMM, but the VTL comment above makes me think you're talking about
> something entirely different. E.g. can you provide a brief summary of
> the requirements/expectations?

Let me refresh some concepts first. VTLs are vCPU modes implemented by the
hypervisor. Lower VTLs switch into higher VTLs [1] through a hypercall or
asynchronously through interrupts. Each VTL has its own CPU architectural
state, lapic and MSR state (the latter applies to only some MSRs). These
are saved/restored when switching VTLs [2]. Additionally, VTLs share a
common GPA->HPA mapping, but the protection bits differ depending on which
VTL the CPU is running in. Privileged VTLs may revoke R/W/X (+MBEC,
optional) access bits from lower VTLs on a per-GPA basis.

In order to deal with the per-VTL memory protection bits, we extended the
number of KVM address spaces and assigned one to each VTL. The hypervisor
initializes all VTLs' address spaces with the same mappings and
protections; they are expected to diverge at runtime. Operations that rely
on memory slots for GPA->HPA/HVA translations (including page faults) are
already address space aware, so adding VTL support was fairly simple.
Ultimately, when a privileged VTL enforces memory protections on lower
VTLs, we update that VTL's address space memory regions to reflect them.
Protection changes are requested through a hypercall, which expects the new
protections to be visible system-wide upon return. These hypercalls happen
100,000+ times during boot, so we introduced an "atomic memory slot update"
API similar to Emanuele's [3] that allows splitting memory regions and
changing permissions concurrently with other running vCPUs.

Now, if we had a way to map memory attributes to specific VTLs, we could
use that instead. Actually, we wouldn't need to extend address spaces at
all to support this (we might still need them to support overlay pages, but
that's another story). I've appended a rough sketch of what I mean at the
end of this mail.

Hope it makes a little more sense now. :)

Nicolas

[1] In practice we've only seen VTL0 and VTL1 being used. The spec supports
    up to 16 VTLs.
[2] One can draw an analogy with Arm's TrustZone. The hypervisor plays the
    role of EL3. Windows (VTL0) runs in Non-Secure (EL0/EL1) and the secure
    kernel (VTL1) in Secure World (EL1s/EL0s).
[3] https://lore.kernel.org/all/20220909104506.738478-1-eesposit@xxxxxxxxxx/
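
P.S. To make the "memory attributes per VTL" idea above a bit more
concrete, here's a minimal userspace sketch. It assumes the uAPI from this
series (struct kvm_memory_attributes plus the R/W/X attribute bits); the
VTL selector is entirely made up, I'm just stuffing it into 'flags' (which
the series currently requires to be zero) to illustrate the kind of knob
we'd need:

  #include <stdint.h>
  #include <sys/ioctl.h>
  #include <linux/kvm.h>        /* with this series' uAPI applied */

  /* Hypothetical: select which VTL's attribute layer the op targets. */
  #define KVM_MEMORY_ATTRIBUTES_VTL(n)  ((__u64)(n))

  /*
   * VTL1 revoked write/execute access to a GPA range for VTL0 (via the
   * VTL protection hypercall); mirror that in VTL0's attribute layer
   * with a single ioctl instead of a memslot layout update.
   */
  static int vtl0_make_read_only(int vm_fd, __u64 gpa, __u64 size)
  {
          struct kvm_memory_attributes attrs = {
                  .address    = gpa,
                  .size       = size,
                  .attributes = KVM_MEMORY_ATTRIBUTE_READ,
                  .flags      = KVM_MEMORY_ATTRIBUTES_VTL(0),
          };

          return ioctl(vm_fd, KVM_SET_MEMORY_ATTRIBUTES, &attrs);
  }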
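
On the KVM side the change looks similarly contained, at least on paper. If
I'm reading the series right, attributes live in a single per-VM xarray
(kvm->mem_attr_array); making that per address space (i.e. per VTL for us)
could be as simple as something like:

  /*
   * Sketch only, not tested: one attribute layer per address space
   * instead of a single per-VM one, so each VTL keeps independent
   * protections without ever touching the memslot layout.
   */
  struct kvm {
          ...
          struct xarray mem_attr_arrays[KVM_ADDRESS_SPACE_NUM];
          ...
  };

  static unsigned long kvm_get_mem_attrs(struct kvm *kvm, int as_id,
                                         gfn_t gfn)
  {
          return xa_to_value(xa_load(&kvm->mem_attr_arrays[as_id], gfn));
  }

The page fault and protection-change paths are already address space aware,
so they'd pick the right attribute layer the same way they pick the right
memslot today.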