On Fri, Nov 10, 2023 at 09:09:33AM -0800, Sean Christopherson wrote:
> On Wed, Nov 08, 2023, Yan Zhao wrote:
> > On Tue, Nov 07, 2023 at 10:06:02AM -0800, Sean Christopherson wrote:
> > > On Tue, Nov 07, 2023, Yan Zhao wrote:
> > > > On Mon, Nov 06, 2023 at 02:34:08PM -0800, Sean Christopherson wrote:
> > > > > On Wed, Nov 01, 2023, Yan Zhao wrote:
> > > > > > On Tue, Oct 31, 2023 at 08:14:41AM -0700, Sean Christopherson wrote:
> > > >
> > > > > > If no #MC, could EPT type of guest RAM also be set to WB (without IPAT) even
> > > > > > without non-coherent DMA?
> > > > >
> > > > > No, there are snooping/ordering issues on Intel, and to a lesser extent AMD. AMD's
> > > > > WC+ solves the most straightforward cases, e.g. WC+ snoops caches, and VMRUN and
> > > > > #VMEXIT flush the WC buffers to ensure that guest writes are visible and #VMEXIT
> > > > > (and vice versa). That may or may not be sufficient for multi-threaded use cases,
> > > > > but I've no idea if there is actually anything to worry about on that front. I
> > > > > think there's also a flaw with guest using UC, which IIUC doesn't snoop caches,
> > > > > i.e. the guest could get stale data.
> > > > >
> > > > > AFAIK, Intel CPUs don't provide anything like WC+, so KVM would have to provide
> > > > > something similar to safely let the guest control memtypes. Arguably, KVM should
> > > > > have such mechanisms anyways, e.g. to make non-coherent DMA VMs more robust.
> > > > >
> > > > > But even then, there's still the question of why, i.e. what would be the benefit
> > > > > of letting the guest control memtypes when it's not required for functional
> > > > > correctness, and would that benefit outweigh the cost.
> > > >
> > > > Ok, so for a coherent device, if it's assigned together with a non-coherent
> > > > device, and if there's a page with host PAT = WB and guest PAT=UC, we need to
> > > > ensure the host write is flushed before guest read/write and guest DMA though no
> > > > need to worry about #MC, right?
> > >
> > > It's not even about devices, it applies to all non-MMIO memory, i.e. unless the
> > > host forces UC for a given page, there's potential for WB vs. WC/UC issues.
> > Do you think we can have KVM to expose an ioctl for QEMU to call in QEMU's
> > invalidate_and_set_dirty() or in cpu_physical_memory_set_dirty_range()?
> >
> > In this ioctl, it can do nothing if non-coherent DMA is not attached and
> > call clflush otherwise.
>
> Why add an ioctl()? Userspace can do CLFLUSH{OPT} directly. If it would fix a
> real problem, then adding some way for userspace to query whether or not there
> is non-coherent DMA would be reasonable, though that seems like something that
> should be in VFIO (if it's not already there).

Ah, right. I previously thought that, with an ioctl(), KVM could also skip the
clflush when TDP is not enabled.
But it's not a real problem so far, as I didn't manage to devise a case that
proves the WB vs WC/UC issues. (i.e. in my devised cases, even with guest memory
mapped to WC, the host can still get the latest data with WB...)
I may come back to this later if it proves to be a real issue in the future. :)
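
For reference, a rough sketch of what "userspace does the flush directly" might
look like. This is only an illustration under assumptions, not QEMU or KVM code:
the helper name flush_range_if_noncoherent() and its noncoherent_dma flag are
hypothetical, the 64-byte line size is assumed rather than read from CPUID, and
how userspace would learn about non-coherent DMA (e.g. from VFIO) is left open.
It uses CLFLUSH via _mm_clflush() rather than CLFLUSHOPT so that it builds
without extra compiler flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__) && (defined(__x86_64__) || defined(__i386__))
#include <emmintrin.h>          /* _mm_clflush(), _mm_mfence() */
#define HAVE_CLFLUSH 1
#else
#define HAVE_CLFLUSH 0          /* non-x86 build: only count lines */
#endif

/* Assumed line size; real code should derive it from CPUID.01H:EBX[15:8] * 8. */
#define CACHE_LINE_SIZE 64u

/*
 * Hypothetical helper: flush [addr, addr + len) from the CPU caches, but only
 * when the caller reports that non-coherent DMA is attached. Otherwise it does
 * nothing, mirroring the behaviour proposed for the ioctl.
 * Returns the number of cache lines visited.
 */
static size_t flush_range_if_noncoherent(const void *addr, size_t len,
                                         bool noncoherent_dma)
{
    assert(addr != NULL);

    if (!noncoherent_dma || len == 0)
        return 0;

    /* Round the start down to a cache-line boundary. */
    uintptr_t start = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE_SIZE - 1);
    uintptr_t end = (uintptr_t)addr + len;
    size_t lines = 0;

    for (uintptr_t p = start; p < end; p += CACHE_LINE_SIZE) {
#if HAVE_CLFLUSH
        _mm_clflush((const void *)p);
#endif
        lines++;
    }
#if HAVE_CLFLUSH
    _mm_mfence();               /* order the flushes before later accesses */
#endif
    return lines;
}
```

A caller would invoke this on the range it just wrote, e.g.
flush_range_if_noncoherent(buf, len, have_noncoherent_dma), where
have_noncoherent_dma is whatever userspace manages to query.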