On Wed, Nov 08, 2023, Sean Christopherson wrote:
> On Wed, Nov 08, 2023, Nicolas Saenz Julienne wrote:
> > This RFC series introduces the necessary infrastructure to emulate VSM
> > enabled guests. It is a snapshot of the progress we made so far, and its
> > main goal is to gather design feedback.
>
> Heh, then please provide an overview of the design, and ideally context and/or
> justification for various design decisions.  It doesn't need to be a proper design
> doc, and you can certainly point at other documentation for explaining VSM/VTLs,
> but a few paragraphs and/or verbose bullet points would go a long way.
>
> The documentation in patch 33 provides an explanation of VSM itself, and a little
> insight into how userspace can utilize the KVM implementation.  But the documentation
> provides no explanation of the mechanics that KVM *developers* care about, e.g.
> the use of memory attributes, how memory attributes are enforced, whether or not
> an in-kernel local APIC is required, etc.
>
> Nor does the documentation explain *why*, e.g. why store a separate set of memory
> attributes per VTL "device", which by the by is broken and unnecessary.

After speed reading the series, an overview of the design, why you made certain
choices, and the tradeoffs between the various options are definitely needed.

A few questions off the top of my head:

 - What is the split between userspace and KVM?  How did you arrive at that split?

 - How much *needs* to be in KVM?  I.e. how much can be pushed to userspace while
   maintaining good performance?

 - Why not make VTLs a first-party concept in KVM?  E.g. rather than bury info in
   a VTL device and APIC ID groups, why not modify "struct kvm" to support
   replicating state that needs to be tracked per-VTL?  Because of how memory
   attributes affect hugepages, duplicating *memslots* might actually be easier
   than teaching memslots to be VTL-aware.
 - Is "struct kvm_vcpu" the best representation of an execution context (if I'm
   getting the terminology right)?  E.g. if 90% of the state is guaranteed to be
   identical for a given vCPU across execution contexts, then modeling that with
   separate kvm_vcpu structures is very inefficient.  I highly doubt it's 90%,
   but it might be quite high depending on how much the TLFS restricts the state
   of the vCPU, e.g. if it's 64-bit only.

The more info you can provide before LPC, the better, e.g. so that we can spend
time discussing options instead of you getting peppered with questions about the
requirements and whatnot.