On Thu, 2023-05-04 at 08:34 -0700, Sean Christopherson wrote:
> On Wed, May 03, 2023, Kai Huang wrote:
> > > for better or worse, KVM doesn't apply the "zap
> > > SPTEs" logic to guest PAT changes when the VM has a passthrough device
> > > with non-coherent DMA.
> >
> > Is it a bug?
>
> No. KVM's MTRR behavior is using a heuristic to try not to break the VM: if the
> VM has non-coherent DMA, then honor UC mapping in the MTRRs as such mappings may
> be coverage the non-coherent DMA.
>
> From vmx_get_mt_mask():
>
> 	/* We wanted to honor guest CD/MTRR/PAT, but doing so could result in
> 	 * memory aliases with conflicting memory types and sometimes MCEs.
> 	 * We have to be careful as to what are honored and when.
>
> The PAT is problematic because it is referenced via the guest PTEs, versus the
> MTRRs being tied to the guest physical address, e.g. different virtual mappings
> for the same physical address can yield different memtypes via the PAT. My head
> hurts just thinking about how that might interact with shadow paging :-)
>
> Even the MTRRs are somewhat sketchy because they are technically per-CPU, i.e.
> two vCPUs could have different memtypes for the same physical address. But in
> practice, sane software/firmware uses consistent MTRRs across all CPUs.

Agreed on all of the above oddities.

But I think the answer to my question is actually that we simply don't _need_
to zap SPTEs (with non-coherent DMA) when the guest's IA32_PAT is changed:

1) If EPT is enabled, IIUC the guest's PAT is already honored. The VMCS's
GUEST_IA32_PAT always reflects the IA32_PAT value that the guest wants to set,
while the EPT memtype bits are set according to the guest's MTRRs. That means
a guest change to IA32_PAT doesn't require zapping EPT PTEs, as "EPT PTEs
essentially only replace the guest's MTRRs".

2) If EPT is disabled, looking at the code, if I read it correctly,
'shadow_memtype_mask' is 0 for Intel, in which case KVM won't try to set any
PAT memtype bits in the shadow MMU PTEs. That means the true PAT memtype is
always WB and the guest's memtype is never honored (the guest's MTRRs are also
never actually used by HW), which should be fine I guess?? My brain refused to
go further :)

But anyway, back to my question: I think "changing the guest's IA32_PAT"
shouldn't result in needing to "zap SPTEs".
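
To make the two cases above concrete, here is a toy model in C. It is only a
sketch of the argument as described in this mail, not KVM code: every name in
it (toy_vm, toy_spte_memtype(), toy_pat_write_changes_spte(), ...) is made up
for illustration, and it deliberately ignores MMIO, CR0.CD and the "ignore
PAT" bit. The point is simply that the guest's IA32_PAT is not an input to the
SPTE memtype in either case, so a PAT write cannot make an existing SPTE stale.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* x86 memory type encodings, as used by the MTRRs and by EPT. */
enum memtype { MT_UC = 0, MT_WC = 1, MT_WT = 4, MT_WP = 5, MT_WB = 6 };

/* Hypothetical toy state for one guest physical address; not a KVM struct. */
struct toy_vm {
	bool ept_enabled;
	bool noncoherent_dma;
	enum memtype guest_mtrr_type;	/* guest MTRR type for this GPA */
	uint64_t guest_ia32_pat;	/* value the guest wrote to IA32_PAT */
};

/* What the (toy) SPTE carries in terms of memory type. */
struct toy_spte_mt {
	bool has_memtype;	/* false models shadow_memtype_mask == 0 */
	enum memtype type;
};

/*
 * Case 1 (EPT): memtype follows the guest MTRRs when there is non-coherent
 * DMA, and is modeled as plain WB otherwise.
 * Case 2 (no EPT): no memtype bits are set in the SPTE at all.
 * Note that guest_ia32_pat is never read here.
 */
struct toy_spte_mt toy_spte_memtype(const struct toy_vm *vm)
{
	struct toy_spte_mt mt = { .has_memtype = false, .type = MT_WB };

	if (!vm->ept_enabled)
		return mt;

	mt.has_memtype = true;
	mt.type = vm->noncoherent_dma ? vm->guest_mtrr_type : MT_WB;
	return mt;
}

/* Would a guest write to IA32_PAT change what the SPTE should contain? */
bool toy_pat_write_changes_spte(struct toy_vm vm, uint64_t new_pat)
{
	struct toy_spte_mt before = toy_spte_memtype(&vm);
	struct toy_spte_mt after;

	vm.guest_ia32_pat = new_pat;
	after = toy_spte_memtype(&vm);

	return before.has_memtype != after.has_memtype ||
	       before.type != after.type;
}

/* Small self-check covering both cases; call it from any test harness. */
void toy_self_check(void)
{
	struct toy_vm vm = {
		.ept_enabled = true,
		.noncoherent_dma = true,
		.guest_mtrr_type = MT_UC,
		.guest_ia32_pat = 0x0007040600070406ULL, /* power-on default */
	};

	assert(!toy_pat_write_changes_spte(vm, 0));	/* EPT case */
	vm.ept_enabled = false;
	assert(!toy_pat_write_changes_spte(vm, 0));	/* shadow paging case */
}
```

In this model a guest MTRR change can alter the result of toy_spte_memtype()
when EPT and non-coherent DMA are both in play, whereas an IA32_PAT change
never can, which is the asymmetry argued for above.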