On Thu, Mar 07, 2024, Kai Huang wrote: > > > On 6/03/2024 3:02 pm, Sean Christopherson wrote: > > On Wed, Mar 06, 2024, Kai Huang wrote: > > > > > > > > > On 6/03/2024 1:38 pm, Sean Christopherson wrote: > > > > On Wed, Mar 06, 2024, Kai Huang wrote: > > > > > > > > > > > > > > > On 28/02/2024 3:41 pm, Sean Christopherson wrote: > > > > > > Prioritize private vs. shared gfn attribute checks above slot validity > > > > > > checks to ensure a consistent userspace ABI. E.g. as is, KVM will exit to > > > > > > userspace if there is no memslot, but emulate accesses to the APIC access > > > > > > page even if the attributes mismatch. > > > > > > > > > > IMHO, it would be helpful to explicitly say that, in the later case (emulate > > > > > APIC access page) we still want to report MEMORY_FAULT error first (so that > > > > > userspace can have chance to fixup, IIUC) instead of emulating directly, > > > > > which will unlikely work. > > > > > > > > Hmm, it's not so much that emulating directly won't work, it's that KVM would be > > > > violating its ABI. Emulating APIC accesses after userspace converted the APIC > > > > gfn to private would still work (I think), but KVM's ABI is that emulated MMIO > > > > is shared-only. > > > > > > But for (at least) TDX guest I recall we _CAN_ allow guest's MMIO to be > > > mapped as private, right? The guest is supposed to get a #VE anyway? > > > > Not really. KVM can't _map_ emulated MMIO as private memory, because S-EPT > > entries can only point at convertible memory. > > Right. I was talking about the MMIO mapping in the guest, which can be > private I suppose. > > > KVM _could_ emulate in response to a !PRESENT EPT violation, but KVM is not > > going to do that. > > > > https://lore.kernel.org/all/ZcUO5sFEAIH68JIA@xxxxxxxxxx > > > > Right agreed KVM shouldn't "emulate" !PRESENT fault. One clarification. KVM *does* emulate !PRESENT faults. And that's not optional, as caching MMIO info in SPTEs is purely an optimization and isn't possible on all CPUs, e.g. AMD CPUs with MAXPHYADDR=52 don't have reserved bits to set. My point above was specifically about emulating *private* !PRESENT faults as MMIO. > I am talking about this -- for TDX guest, if I recall correctly, for guest's > MMIO gfn KVM still needs to get the EPT violation for the _first_ access, in > which KVM can configure the EPT entry to make sure "suppress #VE" bit is > cleared so the later accesses can trigger #VE directly. That's totally fine, so long as the access is shared, not private. > I suppose this is still the way we want to implement? > > I am afraid for this case, we will still see the MMIO GFN as private, while > by default I believe the "guest memory attributes" for that MMIO GFN should > be shared? No, the guest should know it's (emulated) MMIO and access the gfn as shared. That might generate a !PRESENT fault, but that's again a-ok. > AFAICT, it seems the "guest memory attributes" code doesn't check whether a > GFN is MMIO or truly RAM. That's userspace's responsibility, not KVM's responsibility. And if userspace doesn't proactively make emulated MMIO regions shared, then the memory_fault exit that results from this patch will give userspace the hook it needs to convert the gfn to shared on-demand.