On Thu, May 20, 2021 at 01:13:34PM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@xxxxxxxxxx>
>
> Changelog:
> v1:
>  * Enabled by default RO in IB/core instead of changing all users
> v0: https://lore.kernel.org/lkml/20210405052404.213889-1-leon@xxxxxxxxxx
>
> From Avihai,
>
> Relaxed Ordering is a PCIe mechanism that relaxes the strict ordering
> imposed on PCI transactions, and thus, can improve performance for
> applications that can handle this lack of strict ordering.
>
> Currently, relaxed ordering can be set only by user space applications
> for user MRs. Not all user space applications support relaxed ordering
> and for this reason it was added as an optional capability that is
> disabled by default. This behavior is not changed as part of this series,
> and relaxed ordering remains disabled by default for user space.
>
> On the other hand, kernel users should universally support relaxed
> ordering, as they are designed to read data only after observing the CQE
> and use the DMA API correctly. There are a few platforms with broken
> relaxed ordering implementation, but for them relaxed ordering is expected
> to be turned off globally in the PCI level. In addition, note that this is
> not the first use of relaxed ordering. Relaxed ordering has been enabled
> by default in mlx5 ethernet driver, and user space apps use it as well for
> quite a while.
>
> Hence, this series enabled relaxed ordering by default for kernel users so
> they can benefit as well from the performance improvements.
>
> The following test results show the performance improvement achieved
> with relaxed ordering. The test was performed by running FIO traffic
> between a NVIDIA DGX A100 (ConnectX-6 NICs and AMD CPUs) and a NVMe
> storage fabric, using NFSoRDMA:
>
> Without Relaxed Ordering:
> READ: bw=16.5GiB/s (17.7GB/s), 16.5GiB/s-16.5GiB/s (17.7GB/s-17.7GB/s),
> io=1987GiB (2133GB), run=120422-120422msec
>
> With relaxed ordering:
> READ: bw=72.9GiB/s (78.2GB/s), 72.9GiB/s-72.9GiB/s (78.2GB/s-78.2GB/s),
> io=2367GiB (2542GB), run=32492-32492msec
>
> The series has been tested over NVMe, iSER, SRP and NFS with ConnectX-6
> NIC. The tests included FIO verify and stress tests, and various
> resiliency tests (shutting down NIC port in the middle of traffic,
> rebooting the target in the middle of traffic etc.).

There was such a big discussion on the last version that I wondered why
this one was so quiet. I guess it is because the cc list isn't very big.

Adding the people from the original thread. Here are the patches:

https://lore.kernel.org/linux-rdma/cover.1621505111.git.leonro@xxxxxxxxxx/

I think this is the general approach that was asked for: special-case
uverbs and turn it on in the kernel universally.

Jason
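
For anyone joining from the original thread, the policy described above (relaxed ordering stays opt-in for uverbs, and is turned on universally for kernel users) can be sketched in a few lines of standalone C. This is only an illustration, not code from the series: the flag values, the `mr_owner` enum, and both helper functions are hypothetical names, and the platform gate is a simplified stand-in for relaxed ordering being turned off globally at the PCI level.

```c
#include <stdbool.h>

/* Hypothetical access flags for illustration only; these are NOT the
 * kernel's or libibverbs' definitions. */
#define MR_ACCESS_LOCAL_WRITE      (1u << 0)
#define MR_ACCESS_REMOTE_READ      (1u << 1)
#define MR_ACCESS_RELAXED_ORDERING (1u << 2)

/* Who is registering the memory region (hypothetical type). */
enum mr_owner { MR_OWNER_USER, MR_OWNER_KERNEL };

/*
 * Compute the access flags an MR ends up with.
 * Kernel-owned MRs always get relaxed ordering added.
 * User MRs keep exactly what was requested, so relaxed ordering
 * remains opt-in for user space.
 */
static unsigned int mr_effective_access(enum mr_owner owner,
                                        unsigned int requested)
{
    if (owner == MR_OWNER_KERNEL)
        return requested | MR_ACCESS_RELAXED_ORDERING;
    return requested;
}

/*
 * Assumption: model "turned off globally at the PCI level" as a final
 * gate. Even if the MR carries the flag, relaxed ordering is not used
 * when the platform has it disabled.
 */
static bool mr_uses_relaxed_ordering(unsigned int access,
                                     bool platform_ro_enabled)
{
    return platform_ro_enabled &&
           (access & MR_ACCESS_RELAXED_ORDERING) != 0;
}
```

With this sketch, a kernel caller that requests only `MR_ACCESS_LOCAL_WRITE` ends up with the relaxed ordering bit set, while a user caller gets it only if it asked for it. On a platform where relaxed ordering is disabled, `mr_uses_relaxed_ordering()` returns false in both cases.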