Jan Kiszka <jan.kiszka@xxxxxx> writes:

> On 2014-03-26 21:22, Bandan Das wrote:
>> Jan Kiszka <jan.kiszka@xxxxxx> writes:
>>
>>> On 2014-03-22 17:43, Bandan Das wrote:
>>>> Jan Kiszka <jan.kiszka@xxxxxx> writes:
>>>>
>>>>> On 2014-03-20 21:58, Bandan Das wrote:
>>>>>> Jan Kiszka <jan.kiszka@xxxxxxxxxxx> writes:
>>>>>>
>>>>>>> On 2014-03-20 04:28, Bandan Das wrote:
>>>>>>>> Some L1 hypervisors such as Xen seem to be calling invept after
>>>>>>>> vmclear or before vmptrld on L2. In this case, proceed with
>>>>>>>> falling through and syncing roots as a case where
>>>>>>>> context-wide invalidation can't be supported.
>>>>>>>
>>>>>>> Can we also base this behaviour on a statement in the SDM? On first
>>>>>>> glance, I do not find anything like this over there.
>>>>>>
>>>>>> The SDM has nothing of this sort explicitly mentioned, but 28.3.3.1
>>>>>> "Operations that Invalidate Cached Mappings" does mention that
>>>>>> the instruction may invalidate mappings associated with other
>>>>>> EP4TAs (even in single context).
>>>>>
>>>>> Yes, "may". So we are implementing undefined behavior in order to
>>>>> please a broken hypervisor that relies on it? Then please state this
>>>>> in the patch and probably also inform Xen about their issue.
>>>>
>>>> Why undefined behavior? We don't do anything specific for the
>>>> single-context invalidation case anyway, i.e. if the eptp matches what
>>>> vmcs12 has, single-context invalidation already falls through to the
>>>> global invalidation case. All this change does is add the "L1 calls
>>>> invept after vmclear and before vmptrld" case to the list of cases
>>>> that fall through to global invalidation, since nvmx has no knowledge
>>>> of the current eptp in this case.
>>>
>>> OK, I think I misunderstood what the guest expects and how we currently
>>> achieve this: we do not track the mapping between guest and host eptp,
>>> thus cannot properly emulate its behaviour. We therefore need to flush
>>> everything.
>>>
>>>> Or do you think we should rethink this approach?
>>>
>>> Well, I wonder if we should expose single-context invept support at all.
>>>
>>> I'm also wondering if we are returning proper flags on
>>>
>>> 	if ((operand.eptp & eptp_mask) !=
>>> 			(nested_ept_get_cr3(vcpu) & eptp_mask))
>>> 		break;
>>
>> That does sound plausible, but then we would have to get rid of the
>> little optimization in the code you have quoted above. We would have
>> to flush and sync roots for all invept calls. So my preference is to
>> keep it.
>
> Do you have any numbers on how many per-context invept calls we are able
> to ignore? At least for KVM this should be 0, as we only flush on
> changes to the current EPT structures.

Just instrumented a Debian L2 bootup under L1 Xen, and there were 0
per-context invept calls (with a valid vmcs12). I will spin out a
version for testing where we don't advertise single-context invept and,
if everything works, put a comment as to why we are removing support.

Thanks,
Bandan

> Jan
>
>>> Neither nested_vmx_succeed nor nested_vmx_fail* is called if this
>>> condition evaluates to true.
>>
>> It appears we should always call nested_vmx_succeed(). If you are ok,
>> I will send a v2.
>>
>> Thanks,
>> Bandan
>>
>>> Jan
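
For reference, a minimal sketch of the nested_vmx_succeed() point raised
above, i.e. what the single-context arm of handle_invept() could look
like if it reported VMsucceed on the elided-flush path. This assumes the
2014-era layout of arch/x86/kvm/vmx.c (the switch on the invept type and
the kvm_mmu_sync_roots()/kvm_mmu_flush_tlb() helpers of that period) and
reuses the names from the snippet quoted in the thread; it is an
illustration, not the committed patch:

	switch (type) {
	case VMX_EPT_EXTENT_CONTEXT:
		if ((operand.eptp & eptp_mask) !=
				(nested_ept_get_cr3(vcpu) & eptp_mask)) {
			/*
			 * The flush for a non-current EPTP is elided, but
			 * from L1's point of view the instruction still
			 * executed successfully, so set the VMsucceed
			 * flags instead of leaving them untouched.
			 */
			nested_vmx_succeed(vcpu);
			break;
		}
		/* Current EPTP: fall through to a global invalidation. */
	case VMX_EPT_EXTENT_GLOBAL:
		kvm_mmu_sync_roots(vcpu);
		kvm_mmu_flush_tlb(vcpu);
		nested_vmx_succeed(vcpu);
		break;
	default:
		BUG_ON(1);
		break;
	}

If single-context invept were simply not advertised to L1, as proposed
above, the VMX_EPT_EXTENT_CONTEXT arm would become unreachable for a
well-behaved guest and could be dropped entirely.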