Hi Gavin, On 03/03/2025 07:08, Gavin Shan wrote: > On 2/14/25 2:13 AM, Steven Price wrote: >> The RMM maintains a data structure known as the Realm Execution Context >> (or REC). It is similar to struct kvm_vcpu and tracks the state of the >> virtual CPUs. KVM must delegate memory and request the structures are >> created when vCPUs are created, and suitably tear down on destruction. >> >> RECs must also be supplied with addition pages - auxiliary (or AUX) >> granules - for storing the larger registers state (e.g. for SVE). The >> number of AUX granules for a REC depends on the parameters with which >> the Realm was created - the RMM makes this information available via the >> RMI_REC_AUX_COUNT call performed after creating the Realm Descriptor >> (RD). >> >> Note that only some of register state for the REC can be set by KVM, the >> rest is defined by the RMM (zeroed). The register state then cannot be >> changed by KVM after the REC is created (except when the guest >> explicitly requests this e.g. by performing a PSCI call). The RMM also >> requires that the VMM creates RECs in ascending order of the MPIDR. >> >> See Realm Management Monitor specification (DEN0137) for more >> information: >> https://developer.arm.com/documentation/den0137/ >> >> Signed-off-by: Steven Price <steven.price@xxxxxxx> >> --- >> Changes since v6: >> * Avoid reporting the KVM_ARM_VCPU_REC feature if the guest isn't a >> realm guest. >> * Support host page size being larger than RMM's granule size when >> allocating/freeing aux granules. >> Changes since v5: >> * Separate the concept of vcpu_is_rec() and >> kvm_arm_vcpu_rec_finalized() by using the KVM_ARM_VCPU_REC feature as >> the indication that the VCPU is a REC. >> Changes since v2: >> * Free rec->run earlier in kvm_destroy_realm() and adapt to previous >> patches. >> --- >> arch/arm64/include/asm/kvm_emulate.h | 7 ++ >> arch/arm64/include/asm/kvm_host.h | 3 + >> arch/arm64/include/asm/kvm_rme.h | 18 +++ >> arch/arm64/kvm/arm.c | 13 +- >> arch/arm64/kvm/reset.c | 11 ++ >> arch/arm64/kvm/rme.c | 179 +++++++++++++++++++++++++++ >> 6 files changed, 229 insertions(+), 2 deletions(-) >> > > With the following one comment addressed: > > Reviewed-by: Gavin Shan <gshan@xxxxxxxxxx> > > [...] > >> /* >> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/ >> asm/kvm_rme.h >> index 698bb48a8ae1..5db377943db4 100644 >> --- a/arch/arm64/include/asm/kvm_rme.h >> +++ b/arch/arm64/include/asm/kvm_rme.h >> @@ -6,6 +6,7 @@ >> #ifndef __ASM_KVM_RME_H >> #define __ASM_KVM_RME_H >> +#include <asm/rmi_smc.h> >> #include <uapi/linux/kvm.h> >> /** >> @@ -65,6 +66,21 @@ struct realm { >> unsigned int ia_bits; >> }; >> +/** >> + * struct realm_rec - Additional per VCPU data for a Realm >> + * >> + * @mpidr: MPIDR (Multiprocessor Affinity Register) value to identify >> this VCPU >> + * @rec_page: Kernel VA of the RMM's private page for this REC >> + * @aux_pages: Additional pages private to the RMM for this REC >> + * @run: Kernel VA of the RmiRecRun structure shared with the RMM >> + */ >> +struct realm_rec { >> + unsigned long mpidr; >> + void *rec_page; >> + struct page *aux_pages[REC_PARAMS_AUX_GRANULES]; >> + struct rec_run *run; >> +}; >> + > > REC_PARAMS_AUX_GRANULES represents the maximal number of the auxiliary > granules. > Since the base page size is always larger than or equal to granule size > (4KB). > The capacity of array @aux_pages[] needs to be REC_PARAMS_AUX_GRANULES. > Ideally, > the array's size can be computed dynamically and it's allocated in > kvm_create_rec(). This is indeed another example of where pages and granules have got mixed. The RMM specification provides a maximum number of granules (and there's a similar array in struct rec_params). Here the use of REC_PARAMS_AUX_GRANULES is just to keep the structure more simple (no dynamic allocation) with the cost of ~128bytes. Obviously if PAGE_SIZE>4k then this array could be smaller. > Alternatively, to keep the code simple, a comment is needed here to > explain why > the array's size has been set to REC_PARAMS_AUX_GRANULES. Definitely a valid point - this could do with a comment explaining the situation. > An relevant question: Do we plan to support differentiated sizes between > page > and granule? I had the assumption this feature will be supported in the > future > after the base model (equal page and granule size) gets merged first. Yes I do plan to support it. This series actually has the basic support in it already, with an experimental patch at the end enabling that support. It "works" but I haven't tested it well and I think some of the error handling isn't quite right yet. But there's also a bunch more work to be done (like here) to avoid over-allocating memory when PAGE_SIZE>4k. E.g. RTTs need an sub-allocator so that we don't waste an entire (larger) page on each RTT. Steve