On Thu, 3 Jun 2021 09:40:36 -0300
Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:

> On Thu, Jun 03, 2021 at 03:22:27AM +0000, Tian, Kevin wrote:
> > > From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > > Sent: Thursday, June 3, 2021 10:51 AM
> > >
> > > On Wed, 2 Jun 2021 19:45:36 -0300
> > > Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
> > >
> > > > On Wed, Jun 02, 2021 at 02:37:34PM -0600, Alex Williamson wrote:
> > > >
> > > > > Right. I don't follow where you're jumping to relaying DMA_PTE_SNP
> > > > > from the guest page table... what page table?
> > > >
> > > > I see my confusion now, the phrasing in your earlier remark led me
> > > > think this was about allowing the no-snoop performance enhancement in
> > > > some restricted way.
> > > >
> > > > It is really about blocking no-snoop 100% of the time and then
> > > > disabling the dangerous wbinvd when the block is successful.
> > > >
> > > > Didn't closely read the kvm code :\
> > > >
> > > > If it was about allowing the optimization then I'd expect the guest to
> > > > enable no-snoopable regions via it's vIOMMU and realize them to the
> > > > hypervisor and plumb the whole thing through. Hence my remark about
> > > > the guest page tables..
> > > >
> > > > So really the test is just 'were we able to block it' ?
> > >
> > > Yup. Do we really still consider that there's some performance benefit
> > > to be had by enabling a device to use no-snoop? This seems largely a
> > > legacy thing.
> >
> > Yes, there is indeed performance benefit for device to use no-snoop,
> > e.g. 8K display and some imaging processing path, etc. The problem is
> > that the IOMMU for such devices is typically a different one from the
> > default IOMMU for most devices. This special IOMMU may not have
> > the ability of enforcing snoop on no-snoop PCI traffic then this fact
> > must be understood by KVM to do proper mtrr/pat/wbinvd virtualization
> > for such devices to work correctly.
>
> Or stated another way:
>
> We in Linux don't have a way to control if the VFIO IO page table will
> be snoop or no snoop from userspace so Intel has forced the platform's
> IOMMU path for the integrated GPU to be unable to enforce snoop, thus
> "solving" the problem.

That's giving vfio a lot of credit for influencing VT-d design.

> I don't think that is sustainable in the oveall ecosystem though.

Our current behavior is a reasonable default IMO, but I agree more
control will probably benefit us in the long run.

> 'qemu --allow-no-snoop' makes more sense to me

I'd be tempted to attach it to the -device vfio-pci option, it's
specific drivers for specific devices that are going to want this and
those devices may not be permanently attached to the VM. But I see in
the other thread you're trying to optimize IOMMU page table sharing.

There's a usability question in either case though and I'm not sure how
to get around it other than QEMU or the kernel knowing a list of
devices (explicit IDs or vendor+class) to select per device defaults.

> > When discussing I/O page fault support in another thread, the consensus
> > is that an device handle will be registered (by user) or allocated (return
> > to user) in /dev/ioasid when binding the device to ioasid fd. From this
> > angle we can register {ioasid_fd, device_handle} to KVM and then call
> > something like ioasidfd_device_is_coherent() to get the property.
> > Anyway the coherency is a per-device property which is not changed
> > by how many I/O page tables are attached to it.
>
> It is not device specific, it is driver specific
>
> As I said before, the question is if the IOASID itself can enforce
> snoop, or not. AND if the device will issue no-snoop or not.
>
> Devices that are hard wired to never issue no-snoop are safe even with
> an IOASID that cannot enforce snoop. AFAIK really only GPUs use this
> feature. Eg I would be comfortable to say mlx5 never uses the no-snoop
> TLP flag.
>
> Only the vfio_driver could know this.

Could you clarify "vfio_driver"? The existing vfio-pci driver can't
know this, beyond perhaps probing if the Enable No-snoop bit is
hardwired to zero. It's the driver running on top of vfio that
ultimately controls whether a capable device actually issues no-snoop
TLPs, but that can't be known to us. A vendor variant of vfio-pci
might certainly know more about how its device is used by those
userspace/VM drivers. Thanks,

Alex
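
For illustration only, the two conditions discussed above (can the IOASID
enforce snoop, and can the device issue no-snoop at all) could be sketched
roughly as below. This is not existing vfio or KVM code. struct fake_devctl
and every function name here are hypothetical, the register is simulated
rather than read from config space, and the write-then-read-back probe is
just one assumed way to detect a hardwired bit. The only value taken from
the PCIe Device Control register layout is the Enable No Snoop bit (bit 11,
0x0800).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Enable No Snoop: bit 11 of the PCIe Device Control register. */
#define DEVCTL_NOSNOOP_EN 0x0800u

/* Hypothetical stand-in for a device's Device Control register. */
struct fake_devctl {
	uint16_t value;    /* current register contents */
	uint16_t writable; /* mask of bits that accept writes */
};

static uint16_t devctl_read(const struct fake_devctl *r)
{
	return r->value;
}

/* Only bits set in the writable mask change; hardwired bits keep their value. */
static void devctl_write(struct fake_devctl *r, uint16_t v)
{
	r->value = (uint16_t)((r->value & ~r->writable) | (v & r->writable));
}

/*
 * Assumed probe: try to set Enable No Snoop, check whether the bit sticks,
 * then restore the original contents. Returns true if the bit reads back
 * as zero, i.e. it appears to be hardwired to zero.
 */
static bool nosnoop_hardwired_zero(struct fake_devctl *r)
{
	uint16_t orig;
	bool stuck_at_zero;

	assert(r);
	orig = devctl_read(r);
	devctl_write(r, (uint16_t)(orig | DEVCTL_NOSNOOP_EN));
	stuck_at_zero = !(devctl_read(r) & DEVCTL_NOSNOOP_EN);
	devctl_write(r, orig);
	return stuck_at_zero;
}

/*
 * wbinvd-style handling only matters when the IOASID cannot enforce snoop
 * AND the device might actually issue no-snoop TLPs.
 */
static bool may_need_wbinvd_emulation(bool ioasid_enforces_snoop,
				      bool device_may_issue_nosnoop)
{
	return !ioasid_enforces_snoop && device_may_issue_nosnoop;
}
```

A caller could feed !nosnoop_hardwired_zero(&reg) in as
device_may_issue_nosnoop, but as noted above that is only an upper bound:
whether a capable device really issues no-snoop TLPs is decided by the
driver running on top of vfio.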