On Fri, Feb 26, 2021 at 09:44:41AM -0800, Sean Christopherson wrote:
> +Will and Quentin (arm64)
>
> Moving the non-KVM x86 folks to bcc, I don't think they care about KVM details at this
> point.
>
> On Fri, Feb 26, 2021, Ashish Kalra wrote:
> > On Thu, Feb 25, 2021 at 02:59:27PM -0800, Steve Rutherford wrote:
> > > On Thu, Feb 25, 2021 at 12:20 PM Ashish Kalra <ashish.kalra@xxxxxxx> wrote:
> > > Thanks for grabbing the data!
> > >
> > > I am fine with both paths. Sean has stated an explicit desire for
> > > hypercall exiting, so I think that would be the current consensus.
>
> Yep, though it'd be good to get Paolo's input, too.
>
> > > If we want to do hypercall exiting, this should be in a follow-up
> > > series where we implement something more generic, e.g. a hypercall
> > > exiting bitmap or hypercall exit list. If we are taking the hypercall
> > > exit route, we can drop the kvm side of the hypercall.
>
> I don't think this is a good candidate for arbitrary hypercall interception.  Or
> rather, I think hypercall interception should be an orthogonal implementation.
>
> The guest, including guest firmware, needs to be aware that the hypercall is
> supported, and the ABI needs to be well-defined.  Relying on userspace VMMs to
> implement a common ABI is an unnecessary risk.
>
> We could make KVM's default behavior be a nop, i.e. have KVM enforce the ABI but
> require further VMM intervention.  But, I just don't see the point, it would
> save only a few lines of code.  It would also limit what KVM could do in the
> future, e.g. if KVM wanted to do its own bookkeeping _and_ exit to userspace,
> then mandatory interception would essentially make it impossible for KVM to do
> bookkeeping while still honoring the interception request.
>
> However, I do think it would make sense to have the userspace exit be a generic
> exit type.  But hey, we already have the necessary ABI defined for that!  It's
> just not used anywhere.
>
> 	/* KVM_EXIT_HYPERCALL */
> 	struct {
> 		__u64 nr;
> 		__u64 args[6];
> 		__u64 ret;
> 		__u32 longmode;
> 		__u32 pad;
> 	} hypercall;
>
>
> > > Userspace could also handle the MSR using MSR filters (would need to
> > > confirm that).  Then userspace could also be in control of the cpuid bit.
>
> An MSR is not a great fit; it's x86 specific and limited to 64 bits of data.
> The data limitation could be fudged by shoving data into non-standard GPRs, but
> that will result in truly heinous guest code, and extensibility issues.
>
> The data limitation is a moot point, because the x86-only thing is a deal
> breaker.  arm64's pKVM work has a near-identical use case for a guest to share
> memory with a host.  I can't think of a clever way to avoid having to support
> TDX's and SNP's hypervisor-agnostic variants, but we can at least not have
> multiple KVM variants.
>

Potentially, there is another reason for in-kernel hypercall handling
when considering SEV-SNP. In the case of SEV-SNP, the RMP table tracks
the state of each guest page, for instance pages in hypervisor state,
i.e., pages with C=0, and pages in guest-valid state with C=1.

So there shouldn't be a need for page encryption status hypercalls on
SEV-SNP, as KVM can track and reference guest page status directly using
the RMP table.

Since KVM maintains the RMP table, we will need SET/GET type interfaces
to provide the guest page encryption status to userspace.

For the above reason, if we do in-kernel hypercall handling for page
encryption status (which we probably won't require for SEV-SNP, and
correspondingly there will be no hypercall exiting), then we can
implement a standard GET/SET ioctl interface to get/set the guest page
encryption status for userspace, which will work across SEV, SEV-ES and
SEV-SNP.

Thanks,
Ashish
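
To make the two options discussed above a bit more concrete, here is a
rough userspace-only sketch, not taken from any posted patch. It keeps
one bit per guest frame (1 = encrypted, C=1; 0 = shared, C=0) behind
GET/SET style helpers, and shows how a VMM might feed a hypercall exit
into the same bookkeeping. Everything here is an assumption made for
illustration: the names enc_status_map, enc_map_*, handle_hypercall_exit
and the HYP_* constants are hypothetical, the hypercall number and the
argument layout (args[0] = GPA, args[1] = page count, args[2] =
encrypted flag) are not a defined ABI, and the "all pages start out
encrypted" default is a guess. The struct only mirrors the field layout
of the KVM_EXIT_HYPERCALL payload quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HYP_PAGE_SHIFT		12	/* assumption: 4 KiB guest pages */
#define HYP_HC_PAGE_ENC_STATUS	12	/* hypothetical hypercall number */

/* Local mirror of the quoted KVM_EXIT_HYPERCALL payload (same fields). */
struct hypercall_exit {
	uint64_t nr;
	uint64_t args[6];
	uint64_t ret;
	uint32_t longmode;
	uint32_t pad;
};

/* Hypothetical bookkeeping: one bit per guest frame, 1 = encrypted. */
struct enc_status_map {
	uint64_t nr_pages;
	uint64_t *bits;
};

static int enc_map_init(struct enc_status_map *m, uint64_t nr_pages)
{
	size_t words = (size_t)((nr_pages + 63) / 64);

	assert(m != NULL);
	if (nr_pages == 0)
		return -1;
	m->bits = malloc(words * sizeof(*m->bits));
	if (!m->bits)
		return -1;
	/* Assumed default: every page starts out encrypted. */
	memset(m->bits, 0xff, words * sizeof(*m->bits));
	m->nr_pages = nr_pages;
	return 0;
}

/* SET: mark [gfn, gfn + npages) as encrypted or shared. */
static int enc_map_set(struct enc_status_map *m, uint64_t gfn,
		       uint64_t npages, bool enc)
{
	/* Written this way so gfn + npages cannot overflow. */
	if (gfn >= m->nr_pages || npages > m->nr_pages - gfn)
		return -1;

	for (uint64_t i = gfn; i < gfn + npages; i++) {
		uint64_t mask = UINT64_C(1) << (i % 64);

		if (enc)
			m->bits[i / 64] |= mask;
		else
			m->bits[i / 64] &= ~mask;
	}
	return 0;
}

/* GET: report the status of a single frame. */
static int enc_map_get(const struct enc_status_map *m, uint64_t gfn,
		       bool *enc)
{
	if (gfn >= m->nr_pages)
		return -1;
	*enc = (m->bits[gfn / 64] >> (gfn % 64)) & 1;
	return 0;
}

/* What a VMM might do on a hypercall exit, under the assumed layout. */
static void handle_hypercall_exit(struct hypercall_exit *hc,
				  struct enc_status_map *m)
{
	if (hc->nr != HYP_HC_PAGE_ENC_STATUS) {
		hc->ret = (uint64_t)-1;	/* unknown hypercall */
		return;
	}
	hc->ret = enc_map_set(m, hc->args[0] >> HYP_PAGE_SHIFT, hc->args[1],
			      hc->args[2] != 0) ? (uint64_t)-1 : 0;
}
```

With in-kernel handling, the same get/set pair would sit behind the
ioctls and KVM would own the bitmap (or consult the RMP table on
SEV-SNP); with hypercall exiting, the VMM would own it as sketched here.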