On Wed, Apr 04, 2018 at 08:02:16PM -0700, Dennis Dalessandro wrote: > From: Sebastian Sanchez <sebastian.sanchez@xxxxxxxxx> > > The struct ib_wc uses two cache-lines per completion, and it is > unaligned. This structure used to fit within one cacheline, but it was > expanded by fields added in the following patches: > > dd5f03beb4f7 ("IB/core: Ethernet L2 attributes in verbs/cm structures") > c865f24628b9 ("IB/core: Add rdma_network_type to wc") > > These new fields are only needed for ethernet and for HCAs that don't > provide the network type to search the proper GID in the GID table. > Since there are two cache-lines, more cache-lines are dirtied per work > completion entry. > > Create a kernel only rvt_wc structure that is a single aligned > cache-line. This reduces the cache lines used per completion and > eliminates any cache line push-pull by aligning the size to a > cache-line. > > Cache-aligning the new kernel completion queue expands struct rvt_cq_wc > breaking the ABI for the user completion queue. Therefore, decouple the > kernel completion queue from struct rvt_cq_wc to prevent this. And why are none of the effected structs in a uapi/rdma include file? Please fix rvt in this regard. hfi and qib are the only remaining drivers in rdma-core that are not using kernel headers. It was too messed up for me to try even try to fix like I did for everyone else. > +/* > * This structure is used to contain the head pointer, tail pointer, > * and completion queue entries as a single memory allocation so > * it can be mmap'ed into user space. > @@ -68,11 +92,8 @@ > struct rvt_cq_wc { > u32 head; /* index of next entry to fill */ > u32 tail; /* index of next ib_poll_cq() entry */ > - union { > - /* these are actually size ibcq.cqe + 1 */ > - struct ib_uverbs_wc uqueue[0]; > - struct ib_wc kqueue[0]; > - }; > + /* this is actually size ibcq.cqe + 1 */ > + struct ib_uverbs_wc uqueue[0]; > }; I suppose this is in the wrong header and there is some open coded version of this in user space? oh lovely, we have this mess in user space: /* * This structure needs to have the same size and offsets as * the kernel's ib_wc structure since it is memory mapped. */ struct hfi1_wc { uint64_t wr_id; enum ibv_wc_status status; enum ibv_wc_opcode opcode; uint32_t vendor_err; uint32_t byte_len; uint32_t imm_data; /* in network byte order */ uint32_t qp_num; uint32_t src_qp; enum ibv_wc_flags wc_flags; uint16_t pkey_index; uint16_t slid; uint8_t sl; uint8_t dlid_path_bits; uint8_t port_num; }; struct hfi1_cq_wc { _Atomic(uint32_t) head; _Atomic(uint32_t) tail; struct hfi1_wc queue[1]; }; And hfi1_wc is really just ib_uverbs_wc and hfi1_cq_wc is the above header but not in a uapi header. Please fix all this in the kernel and rdma-core before sending patches to optimizing things. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html