On 7/5/2017 3:07 PM, Christoph Hellwig wrote:
> On Tue, Jun 27, 2017 at 04:46:31PM -0400, chris hyser wrote:
>> I put this in for SPARC. In our case the host bridge/RC itself follows very
>> strict ordering unless the relaxed order bit is set in the TLP. This works
>> great for devices that actually allow the driver to enable it. We however
>> also have to support an infiniband card that does not support enabling this
>> in the HW and thus in the TLP but is actually fine with relaxed order for
>> the data buffers (ie the streaming I/O vs the coherent control buffers). In
>> fact w/o relaxed order the performance is absolutely atrocious ... w/
>> exceeds x86. This flag enables the driver to signal to us when we map the
>> buffer in the IOMMU to enable the relaxed order attribute for our HW.
>
> We'll really need to start writing down our semantics. As I said

Fair enough. This will get documented.

> given how our streaming dma mappings (dma_map_single/page/sg) are
> defined I can't think of any way how relaxed vs strict ordering would
> matter for them, so just enabling it by default seems like the right
> thing in this case, instead of having to patch every PCIe driver people

People get real squeamish about just turning on relaxed order even for drivers/devices we actually know something about
and test, let alone for everything. I do agree with your interpretation of the streaming DMA semantics. That does not
mean every driver writer did or does. :-)

> might ever use on sparc to work around your bridge.

To be clear, this isn't really a host bridge issue. Technically no driver that runs on SPARC needs this enabled, not
even a performance-critical one, if the device does the HW correctly. Again, we have a necessary, performance-critical
device which does not correctly signal that it is okay with relaxed order for non-control DMA. This feature of the host
bridge exists to enable a workaround for just these cases (deemed needed by past history). Said differently, lots of
devices are made for the commodity market, where shortcuts like this apparently do not affect performance, but getting
this right is critical to performance on large systems.
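
To make the mechanism concrete, here is a rough, user-space toy model of it, not the actual sparc IOMMU code. Every name in it (MAP_ATTR_RELAXED_ORDER, struct iommu_entry, map_buffer, is_relaxed) is hypothetical and made up for illustration. It assumes the bridge treats a transaction as relaxed if either the TLP carries the relaxed order bit or the mapping was created with the attribute set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-mapping attribute bit; not a real kernel constant. */
#define MAP_ATTR_RELAXED_ORDER (1u << 0)

/* Hypothetical IOMMU translation entry; field names are illustrative. */
struct iommu_entry {
    uint64_t phys_addr;
    bool relaxed_order; /* bridge may relax ordering for DMA via this entry */
};

/* Create a mapping: strict ordering unless the driver asked for relaxed. */
static struct iommu_entry map_buffer(uint64_t phys_addr, unsigned int attrs)
{
    struct iommu_entry e;

    /* Reject attribute bits this toy model does not know about. */
    assert((attrs & ~MAP_ATTR_RELAXED_ORDER) == 0);

    e.phys_addr = phys_addr;
    e.relaxed_order = (attrs & MAP_ATTR_RELAXED_ORDER) != 0;
    return e;
}

/*
 * Assumed bridge behaviour: ordering is relaxed if the device set the
 * relaxed order bit in the TLP, or if the mapping carries the attribute.
 */
static bool is_relaxed(const struct iommu_entry *e, bool tlp_ro_bit)
{
    return tlp_ro_bit || e->relaxed_order;
}
```

In this model a control buffer mapped with attrs == 0 stays strictly ordered for a device that never sets the TLP bit, while a data buffer mapped with MAP_ATTR_RELAXED_ORDER gets relaxed ordering even though the device cannot signal it itself.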
-chrish