Re: [PATCH v7 08/14] KVM: arm64: Enable KVM_CAP_MEMORY_FAULT_INFO and annotate fault in the stage-2 fault handler


 



On Mon, Mar 04, 2024 at 12:32:51PM -0800, Sean Christopherson wrote:
> On Mon, Mar 04, 2024, Oliver Upton wrote:
> > On Mon, Mar 04, 2024 at 08:00:15PM +0000, Oliver Upton wrote:

[...]

> > Duh, kvm_vcpu_trap_is_exec_fault() (not to be confused with
> > kvm_vcpu_trap_is_iabt()) filters for S1PTW, so this *should*
> > shake out as a write fault on the stage-1 descriptor.
> > 
> > With that said, an architecture-neutral UAPI may not be able to capture
> > the nuance of a fault. This UAPI will become much more load-bearing in
> > the future, and the loss of granularity could become an issue.
> 
> What is the possible fallout from loss of granularity/nuance?  E.g. if the worst
> case scenario is that KVM may exit to userspace multiple times in order to resolve
> the problem, IMO that's an acceptable cost for having "dumb", common uAPI.
> 
> The intent/contract of the exit to userspace isn't for userspace to be able to
> completely understand what fault occurred, but rather for KVM to communicate what
> action userspace needs to take in order for KVM to make forward progress.

For one, the stage-2 page tables can describe permissions beyond RWX.
MTE tag allocation can be controlled at stage-2, which (confusingly)
describes whether the guest can insert tags in an opaque, physical space not
described by HPFAR.

There is a corresponding bit in ESR_EL2 that describes this at the time
of a fault, and R/W/X flags aren't enough to convey the right corrective
action.
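
To make the concern concrete, here is a rough sketch. Every name in it is
hypothetical (it is neither the UAPI proposed in this series nor the real
ESR_EL2 layout); 'is_tag_access' merely stands in for the ESR_EL2 bit
mentioned above. It shows a plain data write fault and a tag write fault
collapsing to the same R/W/X flags:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical arch-neutral flags, for illustration only; NOT the real UAPI. */
#define FAULT_FLAG_READ		(1u << 0)
#define FAULT_FLAG_WRITE	(1u << 1)
#define FAULT_FLAG_EXEC		(1u << 2)

/* Hypothetical decoded view of a stage-2 permission fault. */
struct s2_fault {
	bool is_write;
	bool is_exec;
	bool is_tag_access;	/* stand-in for the ESR_EL2 bit; has no R/W/X equivalent */
};

/* Reduce the fault to R/W/X, as an arch-neutral interface would have to. */
uint32_t fault_to_rwx(const struct s2_fault *f)
{
	if (f->is_exec)
		return FAULT_FLAG_EXEC;

	/* is_tag_access is dropped here: there is no flag to carry it. */
	return f->is_write ? FAULT_FLAG_WRITE : FAULT_FLAG_READ;
}

/* True when userspace could not tell the two faults apart from the flags. */
bool rwx_indistinguishable(const struct s2_fault *a, const struct s2_fault *b)
{
	return fault_to_rwx(a) == fault_to_rwx(b);
}
```

With { .is_write = true } and { .is_write = true, .is_tag_access = true },
rwx_indistinguishable() returns true, even though the two faults call for
different corrective actions.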

> > Marc had some ideas about forwarding the register state to userspace
> > directly, which should be the right level of information for _any_ fault
> > taken to userspace.
> 
> I don't know enough about ARM to weigh in on that side of things, but for x86
> this definitely doesn't hold true.

We tend to directly model the CPU architecture wherever possible, as it
is the only way to create something intelligible. That same rationale
applies to a huge portion of KVM UAPI; it is architecture-dependent by
design.

> E.g. on the x86 side, KVM intentionally sets
> reserved bits in SPTEs for "caching" emulated MMIO accesses, and the resulting
> fault captures the "reserved bits set" information in register state.  But that's
> purely an (optional) implementation detail of KVM that should never be exposed to
> userspace.

MMIO accesses would show up elsewhere though, right? If these magic
SPTEs were causing -EFAULT exits then something must've gone sideways.

Either way, I have no issues whatsoever if the direction for x86 is to
provide abstracted fault information.

-- 
Thanks,
Oliver



