On Fri, May 15, 2020 at 04:33:52PM -0400, Vivek Goyal wrote:
> On Fri, May 15, 2020 at 09:18:07PM +0200, Paolo Bonzini wrote:
> > On 15/05/20 20:46, Sean Christopherson wrote:
> > >> The new one using #VE is not coming very soon (we need to emulate it for
> > >> <Broadwell and AMD processors, so it's not entirely trivial) so we are
> > >> going to keep "page not ready" delivery using #PF for some time or even
> > >> forever. However, page ready notification as #PF is going away for good.
> > >
> > > And isn't hardware based EPT Violation #VE going to require a completely
> > > different protocol than what is implemented today? For hardware based #VE,
> > > KVM won't intercept the fault, i.e. the guest will need to make an explicit
> > > hypercall to request the page.
> >
> > Yes, but it's a fairly simple hypercall to implement.
> >
> > >> That said, type1/type2 is quite bad. :) Let's change that to page not
> > >> present / page ready.
> > >
> > > Why even bother using 'struct kvm_vcpu_pv_apf_data' for the #PF case? VMX
> > > only requires error_code[31:16]==0 and SVM doesn't vet it at all, i.e. we
> > > can (ab)use the error code to indicate an async #PF by setting it to an
> > > impossible value, e.g. 0xaaaa (a is for async!). That particular error code
> > > is even enforced by the SDM, which states:
> >
> > Possibly, but it's water under the bridge now.
> > And the #PF mechanism also has the problem with NMIs that happen before
> > the error code is read
> > and page faults happening in the handler (you may connect some dots now).
>
> I understood that following was racy.
>
> do_async_page_fault    <--- kvm injected async page fault
>   NMI happens (Before kvm_read_and_reset_pf_reason() is done)
>     ->do_async_page_fault() (This is regular page fault but it will read
>                              reason from shared area and will treat itself
>                              as async page fault)
>
> So this is racy.
>
> But if we get rid of the notion of reading from shared region in page
> fault handler, will we not get rid of this race.
>
> I am assuming that error_code is not racy as it is pushed on stack.
> What am I missing.

Nothing, AFAICT. As I mentioned in a different mail, CR2 can be squished,
but I don't see how error code can be lost.

But, because CR2 can be squished, there still needs to be an in-memory busy
flag even if error code is used as the host #PF indicator, otherwise the
guest could lose one of the tokens.
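
To illustrate the busy-flag point, here is a minimal user-space sketch, not
KVM code. It assumes the token travels in a single register-like slot (the
'cr2' field stands in for CR2) and that the host sets a flag in shared memory
on injection which the guest clears once it has read the token. The struct,
the function names and the 'honor_busy' switch are all hypothetical and exist
only to contrast the two behaviors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model of one vCPU; none of these names exist in KVM. */
struct vcpu_model {
	uint32_t cr2;	/* stand-in for CR2: token of the last injected async #PF */
	bool busy;	/* in-memory flag: set on injection, cleared after the read */
};

/*
 * Host side: inject an async #PF carrying 'token'.
 * With honor_busy set, refuse to inject while the previous token is
 * still unread; the caller would then need some fallback (assumed here
 * to be handling the fault synchronously).
 */
static bool host_inject(struct vcpu_model *v, uint32_t token, bool honor_busy)
{
	if (honor_busy && v->busy)
		return false;

	v->cr2 = token;		/* overwrites whatever token was there before */
	v->busy = true;
	return true;
}

/* Guest side: read the pending token, then tell the host the slot is free. */
static uint32_t guest_consume(struct vcpu_model *v)
{
	uint32_t token;

	assert(v->busy);	/* only valid while a token is pending */
	token = v->cr2;
	v->busy = false;
	return token;
}
```

With honor_busy set, a second host_inject() issued before guest_consume()
returns false and the first token survives. With it clear, the second
injection overwrites 'cr2' and the first token is gone, which is the lost-token
case described above.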