The approach I've taken so far in adding support for SPE in KVM [1] relies
on pinning the entire VM memory to avoid SPE triggering stage 2 faults
altogether. I've taken this approach because:

1. SPE reports the guest VA on a stage 2 fault, similar to stage 1 faults,
and at the moment KVM has no way to resolve the VA to IPA translation. The
AT instruction is not useful here, because PAR_EL1 doesn't report the IPA in
the case of a stage 2 fault on a stage 1 translation table walk.

2. The stage 2 fault is reported asynchronously via an interrupt, which
means there will be a window where profiling is stopped, between the moment
SPE triggers the fault and the moment the PE takes the interrupt. This
blackout window is obviously not present when running on bare metal, as
there is no second stage of address translation being performed.

I've been thinking about this approach and I was considering translating the
VA reported by SPE to the IPA instead, thus treating the SPE stage 2 data
aborts more like regular (MMU) data aborts. As I see it, this approach has
several merits over memory pinning:

- The stage 1 translation table walker is also needed for nested
virtualization, to emulate AT S1* instructions executed by the L1 guest
hypervisor.

- Walking the guest's translation tables is less of a departure from the way
KVM manages physical memory for a virtual machine today.

I had a discussion with Mark offline about this approach and he expressed a
very sensible concern: when a guest is profiling, there is a blackout window
where profiling is stopped which doesn't happen on bare metal (point 2
above).

My questions are:

1. Is having this blackout window, regardless of its size, unacceptable? If
it is, then I'll continue with the memory pinning approach.

2. If having a blackout window is acceptable, how large can this window be
before it becomes too much?

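To make the stage 1 walker idea more concrete, below is a minimal userspace
sketch of a VA to IPA walk. It is not KVM code and not taken from the series
in [1]. It assumes a 4KB granule, 48-bit addresses, four levels starting at
level 0, and it ignores TCR_EL1 configuration, permissions and the access
flag. The names s1_walk, read_desc_fn, toy_mem and toy_read are hypothetical.
The callback stands in for reading guest memory through stage 2, so a failed
read models the "stage 2 fault on a stage 1 translation table walk" case from
point 1 above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Descriptor bits for the 4KB granule (48-bit addresses, no LPA2). */
#define DESC_VALID   (1ULL << 0)
#define DESC_TABLE   (1ULL << 1)           /* table at levels 0-2, page at level 3 */
#define DESC_OA_MASK 0x0000FFFFFFFFF000ULL /* output address bits [47:12] */

enum walk_result { WALK_OK, WALK_S1_FAULT, WALK_S2_FAULT };

/*
 * Hypothetical callback: read one 64-bit descriptor at a guest IPA.
 * Returning false models "this IPA is not mapped at stage 2".
 */
typedef bool (*read_desc_fn)(uint64_t ipa, uint64_t *desc, void *ctx);

/* Walk levels 0..3 from the table base in ttbr. No permission checks. */
enum walk_result s1_walk(uint64_t ttbr, uint64_t va, read_desc_fn rd,
                         void *ctx, uint64_t *ipa)
{
    uint64_t table = ttbr & DESC_OA_MASK;

    assert(rd != NULL && ipa != NULL);

    for (int level = 0; level <= 3; level++) {
        unsigned int shift = 12 + 9 * (3 - level);
        uint64_t offset_mask = (1ULL << shift) - 1;
        uint64_t desc;
        bool table_or_page;

        /* The table lives at an IPA, so this read can fault at stage 2. */
        if (!rd(table + ((va >> shift) & 0x1ff) * 8, &desc, ctx))
            return WALK_S2_FAULT;
        if (!(desc & DESC_VALID))
            return WALK_S1_FAULT;

        table_or_page = desc & DESC_TABLE;
        if (level < 3 && table_or_page) {
            table = desc & DESC_OA_MASK;   /* descend to the next level */
            continue;
        }

        /* Leaf: page at level 3, block at level 1 or 2 only. */
        if ((level == 3 && !table_or_page) || level == 0)
            return WALK_S1_FAULT;
        *ipa = (desc & DESC_OA_MASK & ~offset_mask) | (va & offset_mask);
        return WALK_OK;
    }
    return WALK_S1_FAULT; /* not reached */
}

/* Toy backing store for experiments: a flat array mapped at IPA 'base'. */
struct toy_mem {
    uint64_t base;
    uint64_t *words;
    size_t nwords;
};

bool toy_read(uint64_t ipa, uint64_t *desc, void *ctx)
{
    const struct toy_mem *m = ctx;

    if (ipa < m->base || ipa - m->base >= m->nwords * 8)
        return false;                      /* outside the toy stage 2 mapping */
    *desc = m->words[(ipa - m->base) / 8];
    return true;
}
```

In this sketch a WALK_S2_FAULT result is the interesting case for SPE: the
walk itself touched an IPA that is not mapped at stage 2, so that fault would
have to be resolved first and the walk retried before the reported VA can be
turned into an IPA. How long that takes is part of what question 2 is asking
about.
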
I can try to take some performance measurements to evaluate the blackout
window when using a stage 1 walker in relation to the buffer write speed on
different hardware. I have access to an N1SDP machine and an Ampere Altra
for this.

[1] https://lore.kernel.org/all/20211117153842.302159-1-alexandru.elisei@xxxxxxx/

Thanks,
Alex