> From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> Sent: Friday, May 24, 2024 6:48 AM
>
> On Thu, 23 May 2024 11:58:48 -0300
> Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
>
> > On Wed, May 22, 2024 at 11:40:58PM +0000, Tian, Kevin wrote:
> > > > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > > > Sent: Thursday, May 23, 2024 7:32 AM
> > > >
> > > > On Wed, May 22, 2024 at 11:26:21PM +0000, Tian, Kevin wrote:
> > > > > > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > > > > > Sent: Wednesday, May 22, 2024 8:30 PM
> > > > > >
> > > > > > On Wed, May 22, 2024 at 06:24:14AM +0000, Tian, Kevin wrote:
> > > > > > > I'm fine to do a special check in the attach path to enable the flush
> > > > > > > only for Intel GPU.
> > > > > >
> > > > > > We already effectively do this already by checking the domain
> > > > > > capabilities. Only the Intel GPU will have a non-coherent domain.
> > > > > >
> > > > >
> > > > > I'm confused. In earlier discussions you wanted to find a way to not
> > > > > publish others due to the check of non-coherent domain, e.g. some
> > > > > ARM SMMU cannot force snoop.
> > > > >
> > > > > Then you and Alex discussed the possibility of reducing pessimistic
> > > > > flushes by virtualizing the PCI NOSNOOP bit.
> > > > >
> > > > > With that in mind I was thinking whether we explicitly enable this
> > > > > flush only for Intel GPU instead of checking non-coherent domain
> > > > > in the attach path, since it's the only device with such requirement.
> > > >
> > > > I am suggesting to do both checks:
> > > >  - If the iommu domain indicates it has force coherency then leave PCI
> > > >    no-snoop alone and no flush
> > > >  - If the PCI NOSNOOP bit is or can be 0 then no flush
> > > >  - Otherwise flush
> > >
> > > How to judge whether PCI NOSNOOP can be 0? If following PCI spec
> > > it can always be set to 0 but then we break the requirement for Intel
> > > GPU. If we explicitly exempt Intel GPU in 2nd check then what'd be
> > > the value of doing that generic check?
> > >
> > Non-PCI environments still have this problem, and the first check does
> > help them since we don't have PCI config space there.
> >
> > PCI can supply more information (no snoop impossible) and variant
> > drivers can add in too (want no snoop)
>
> I'm not sure I follow either. Since i915 doesn't set or test no-snoop
> enable, I think we need to assume drivers expect the reset value, so a
> device that supports no-snoop expects to use it, ie. we can't trap on
> no-snoop enable being set, the device is more likely to just operate
> with reduced performance if we surreptitiously clear the bit.
>
> The current proposal is to enable flushing based only on the domain
> enforcement of coherency. I think the augmentation is therefore that
> if the device is PCI and the no-snoop enable bit is zero after reset
> (indicating hardwired to zero), we also don't need to flush.
>
> I'm not sure the polarity of the variant drive statement above is
> correct. If the no-snoop enable bit is set after reset, we'd assume
> no-snoop is possible, so the variant driver would only need a way to
> indicate the device doesn't actually use no-snoop. For that it might
> just virtualize the no-snoop enable setting to vfio-pci-core. Thanks,
>

Yeah. I re-checked the uses of PCI_EXP_DEVCTL_NOSNOOP_EN and all of the
references clear the bit. That echoes the point that a driver which
wants to use no-snoop expects the reset value without doing an explicit
set, so virtualizing the no-snoop enable bit is the more reasonable way
to catch the intention of 'clear'.
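
For reference, the combined check discussed above could be sketched
roughly as follows. This is standalone C for illustration only, not
kernel code: struct dev_coherency_info, its fields and
needs_cache_flush() are hypothetical names, and how a variant driver
would report "no-snoop not used" is an assumption. Only the
PCI_EXP_DEVCTL_NOSNOOP_EN value mirrors pci_regs.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as the definition in include/uapi/linux/pci_regs.h */
#define PCI_EXP_DEVCTL_NOSNOOP_EN 0x0800 /* Enable No Snoop */

/* Hypothetical summary of what is known about a device at attach time. */
struct dev_coherency_info {
	bool domain_enforces_coherency; /* IOMMU domain forces snooping */
	bool is_pci;                    /* PCI config space is available */
	uint16_t devctl_after_reset;    /* Device Control value read after reset */
	bool nosnoop_virtualized_off;   /* assumption: variant driver keeps the bit clear */
};

/* Returns true when CPU cache flushes would still be required. */
bool needs_cache_flush(const struct dev_coherency_info *dev)
{
	assert(dev != NULL);

	/* Check 1: the domain enforces coherency, so no-snoop is harmless. */
	if (dev->domain_enforces_coherency)
		return false;

	if (dev->is_pci) {
		/* Check 2a: bit is zero after reset, i.e. hardwired to zero. */
		if (!(dev->devctl_after_reset & PCI_EXP_DEVCTL_NOSNOOP_EN))
			return false;

		/* Check 2b: bit is virtualized to stay clear for this device. */
		if (dev->nosnoop_virtualized_off)
			return false;
	}

	/* Non-PCI devices, or PCI devices that may really issue no-snoop. */
	return true;
}
```

With this shape a non-PCI device behind a non-coherent domain always
flushes, while a PCI device skips the flush only when no-snoop is
impossible (bit reads zero after reset) or is known to be kept clear.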