On Tue, 2017-10-17 at 18:18 +0300, Leon Romanovsky wrote:
> From: Noa Osherovich <noaos@xxxxxxxxxxxx>
>
> There are root complexes that are able to optimize their
> performance when incoming data is multiple full cache lines.
>
> Scatter end padding is the device's ability to pad the ending of
> incoming packets (scatter)

I hate this naming.  I'm sure people inside of Mellanox have gotten used
to it, but this feature really has no bearing on scatter/gather at all.
This is merely final write padding.  You might have a scatter/gather
list, you might have a single buffer.  Either way, the PCI root complex
couldn't care less about scatter/gather or not, it's all a byte stream
to it.  I would be much happier with a name that reflected what this
really does.

> to full cache line such that the last
> upstream write generated by an incoming packet will be a full cache
> line.
>
> Add a relevant entry to ib_device_cap_flags to report scatter end
> padding capability of an RDMA device.
>
> Add the QP and WQ create flags with an entry for scatter end padding:
> * A QP/WQ created with a scatter end padding flag will cause
>   HW to pad the last upstream write generated by a packet to cache
>   line.
>
> User should consider several factors before activating this feature:
> - In case of high CPU memory load (which may cause PCI back pressure in
>   turn), if a large percent of the writes are partial cache line, this
>   feature should be checked as an optional solution.
> - This feature might reduce performance if most packets are between one
>   and two cache lines and PCIe throughput has reached its maximum
>   capacity. E.g. 65B packet from the network port will lead to 128B
>   write on PCIe, which may cause traffic on PCIe to reach high
>   throughput.
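
To make the 65B example above concrete, here is a minimal sketch of the
padding arithmetic.  It assumes a 64-byte cache line (real platforms may
differ), and the names CACHELINE_BYTES, pad_to_cacheline() and
pad_overhead() are made up for illustration; they are not part of this
patch or of the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache line size, for illustration only. */
#define CACHELINE_BYTES 64u

/*
 * Hypothetical helper (not from the patch): total bytes written upstream
 * for a packet of 'len' bytes when the final write is padded out to a
 * full cache line.
 */
static inline size_t pad_to_cacheline(size_t len)
{
	/* The mask trick below only works for a power-of-two line size. */
	assert((CACHELINE_BYTES & (CACHELINE_BYTES - 1)) == 0);
	return (len + CACHELINE_BYTES - 1) & ~((size_t)CACHELINE_BYTES - 1);
}

/* Extra bytes that padding adds on the PCIe link for one packet. */
static inline size_t pad_overhead(size_t len)
{
	return pad_to_cacheline(len) - len;
}
```

Under these assumptions a 65B packet rounds up to 128B (63 bytes of
padding), while a 64B packet is left unchanged, which is why traffic
made up mostly of packets just over one cache line is the worst case
for PCIe bandwidth.
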
>
> Signed-off-by: Noa Osherovich <noaos@xxxxxxxxxxxx>
> Reviewed-by: Majd Dibbiny <majd@xxxxxxxxxxxx>
> Signed-off-by: Leon Romanovsky <leon@xxxxxxxxxx>
> ---
>  drivers/infiniband/core/uverbs_cmd.c | 3 ++-
>  include/rdma/ib_verbs.h              | 4 ++++
>  2 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> index d31e4bc58e9a..ab29a0327831 100644
> --- a/drivers/infiniband/core/uverbs_cmd.c
> +++ b/drivers/infiniband/core/uverbs_cmd.c
> @@ -1491,7 +1491,8 @@ static int create_qp(struct ib_uverbs_file *file,
>                                 IB_QP_CREATE_MANAGED_RECV |
>                                 IB_QP_CREATE_SCATTER_FCS |
>                                 IB_QP_CREATE_CVLAN_STRIPPING |
> -                               IB_QP_CREATE_SOURCE_QPN)) {
> +                               IB_QP_CREATE_SOURCE_QPN |
> +                               IB_QP_CREATE_SCATTER_END_PADDING)) {

Maybe IB_QP_CREATE_PCI_WRITE_PAD?

>                 ret = -EINVAL;
>                 goto err_put;
>         }
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 9810e4568635..4c0a539cd2a2 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -229,6 +229,8 @@ enum ib_device_cap_flags {
>         /* Deprecated. Please use IB_RAW_PACKET_CAP_SCATTER_FCS. */
>         IB_DEVICE_RAW_SCATTER_FCS               = (1ULL << 34),
>         IB_DEVICE_RDMA_NETDEV_OPA_VNIC          = (1ULL << 35),
> +       /* The device supports padding incoming writes to cacheline. */

/* The device supports padding the final write of a PCI write
transaction to a cacheline boundary so that the PCI root complex can
optimize its memory accesses */

> +       IB_DEVICE_SCATTER_END_PADDING           = (1ULL << 36),
>  };
>
>  enum ib_signature_prot_cap {
> @@ -1098,6 +1100,7 @@ enum ib_qp_create_flags {
>         IB_QP_CREATE_SCATTER_FCS                = 1 << 8,
>         IB_QP_CREATE_CVLAN_STRIPPING            = 1 << 9,
>         IB_QP_CREATE_SOURCE_QPN                 = 1 << 10,
> +       IB_QP_CREATE_SCATTER_END_PADDING        = 1 << 11,
>         /* reserve bits 26-31 for low level drivers' internal use */
>         IB_QP_CREATE_RESERVED_START             = 1 << 26,
>         IB_QP_CREATE_RESERVED_END               = 1 << 31,
> @@ -1621,6 +1624,7 @@ enum ib_wq_flags {
>         IB_WQ_FLAGS_CVLAN_STRIPPING             = 1 << 0,
>         IB_WQ_FLAGS_SCATTER_FCS                 = 1 << 1,
>         IB_WQ_FLAGS_DELAY_DROP                  = 1 << 2,
> +       IB_WQ_FLAGS_SCATTER_END_PADDING         = 1 << 3,
>  };
>
>  struct ib_wq_init_attr {

--
Doug Ledford <dledford@xxxxxxxxxx>
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD