On Thu, Jul 09, 2020 at 10:35:50AM -0700, Jonathan Lemon wrote:
> On Wed, Jul 08, 2020 at 08:26:02PM -0300, Jason Gunthorpe wrote:
> > On Wed, Jul 08, 2020 at 06:16:30PM -0500, Bjorn Helgaas wrote:
> > > I suspect there may be device-specific controls, too, because [1]
> > > claims to enable/disable Relaxed Ordering but doesn't touch the
> > > PCIe Device Control register. Device-specific controls are
> > > certainly allowed, but of course it would be up to the driver, and
> > > the device cannot generate TLPs with Relaxed Ordering unless the
> > > architected PCIe Enable Relaxed Ordering bit is *also* set.
> >
> > Yes, at least on RDMA relaxed ordering can be set on a per transaction
> > basis and is something userspace can choose to use or not at a fine
> > granularity. This is because we have to support historical
> > applications that make assumptions that data arrives in certain
> > orders.
> >
> > I've been thinking of doing the same as this patch but for RDMA kernel
> > ULPs and just globally turn it on if the PCI CAP is enabled as none of
> > our in-kernel uses have the legacy data ordering problem.
>
> If I'm following this correctly - there are two different controls being
> discussed here:
>
> 1) having the driver request PCI relaxed ordering, which may or may
>    not be granted, based on other system settings, and

This is what Bjorn was thinking about, yes, it is some PCI layer
function to control the global config space bit.

> 2) having the driver set RO on the transactions it initiates, which
>    are honored iff the PCI bit is set.
>
> It seems that in addition to the PCI core changes, there still is a need
> for driver controls? Unless the driver always enables RO if it's capable?

I think the PCI spec imagined that when the config space RO bit was
enabled the PCI device would just start using RO packets, in an
appropriate and device specific way. So the fine grained control in #2
is something done extra by some devices.

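To make the relationship between the two controls concrete, here is a
rough user-space sketch in C. It is not kernel code and not taken from
any driver: struct ro_state and both helper functions are hypothetical
names made up for illustration. The only architected detail is the
Enable Relaxed Ordering bit, which is bit 4 (0x0010) of the PCIe Device
Control register, the same value Linux calls PCI_EXP_DEVCTL_RELAX_EN.

```c
#include <stdbool.h>
#include <stdint.h>

/* Enable Relaxed Ordering: bit 4 of the PCIe Device Control register. */
#define DEVCTL_RELAX_EN 0x0010

/* Hypothetical per-device state, for illustration only. */
struct ro_state {
	uint16_t devctl;     /* snapshot of the PCIe Device Control register */
	bool driver_ro_safe; /* driver knows it is functionally correct with RO */
};

/* Control 1: the architected, global config space bit. */
static bool config_ro_enabled(const struct ro_state *s)
{
	return (s->devctl & DEVCTL_RELAX_EN) != 0;
}

/*
 * Control 2: whether the driver should turn on RO in the device.
 * RO TLPs are only allowed when the config space bit is also set,
 * so the driver-side decision is gated on control 1.
 */
static bool driver_should_use_ro(const struct ro_state *s)
{
	return s->driver_ro_safe && config_ro_enabled(s);
}
```

Under this sketch there is no separate user knob: clearing the config
space bit disables RO everywhere, and a driver that is not known to be
safe with RO never enables it.
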
IMHO if the driver knows it is functionally correct with RO then it
should enable it fully on the device when the config space bit is set.

I'm not sure there is a reason to allow users to finely tune RO, at
least I haven't heard of cases where RO is a degradation depending on
workload.

If some platform doesn't work when RO is turned on then it should be
globally blacklisted, as is already done in some cases.

If the device has bugs and uses RO wrong, or the driver has bugs and
is only stable with !RO and Intel, then the driver shouldn't turn it
on at all.

In all of these cases it is not a user tunable.

Development and testing needs, like answering 'is my crash from a RO
bug?', should be met by the device global setpci, I think.

Jason