On Mon, Sep 10, 2018 at 09:49:19AM -0700, Dennis Dalessandro wrote:
> From: Sebastian Sanchez <sebastian.sanchez@xxxxxxxxx>
>
> The struct ib_wc uses two cache-lines per completion, and it is
> unaligned. This structure used to fit within one cacheline, but it was
> expanded by fields added in the following patches:

Like Parav says, that statement seems to be nonsense:

struct ib_wc {
        union {
                u64                wr_id;                /*           8 */
                struct ib_cqe *    wr_cqe;               /*           8 */
        };                                               /*     0     8 */
        enum ib_wc_status          status;               /*     8     4 */
        enum ib_wc_opcode          opcode;               /*    12     4 */
        u32                        vendor_err;           /*    16     4 */
        u32                        byte_len;             /*    20     4 */
        struct ib_qp *             qp;                   /*    24     8 */
        union {
                __be32             imm_data;             /*           4 */
                u32                invalidate_rkey;      /*           4 */
        } ex;                                            /*    32     4 */
        u32                        src_qp;               /*    36     4 */
        u32                        slid;                 /*    40     4 */
        int                        wc_flags;             /*    44     4 */
        u16                        pkey_index;           /*    48     2 */
        u8                         sl;                   /*    50     1 */
        u8                         dlid_path_bits;       /*    51     1 */
        u8                         port_num;             /*    52     1 */
        u8                         smac[6];              /*    53     6 */

        /* XXX 1 byte hole, try to pack */

        u16                        vlan_id;              /*    60     2 */
        u8                         network_hdr_type;     /*    62     1 */

        /* size: 64, cachelines: 1, members: 17 */
        /* sum members: 62, holes: 1, sum holes: 1 */
        /* padding: 1 */
};

> Create a kernel only rvt_wc structure that is a single aligned
> cache-line. This reduces the cache lines used per completion and
> eliminates any cache line push-pull by aligning the size to a
> cache-line.

Not at all sure this is even a good idea to cache align. Most of the
usages here are singletons on-stack and we can reasonably expect the
stack to be hot in the cache. Wasting stack space sounds like a
performance negative..

So not taking this, resend with an accurate commit message and some
performance numbers to try again..

Jason
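
For anyone wanting to check the layout above without pahole, a minimal
userspace sketch follows. It is not kernel code and not part of the patch:
struct wc_mirror, wc_mirror_size() and lines_touched() are hypothetical
names, the kernel enums and pointers are replaced by stand-ins, and it
assumes an LP64 target with 64-byte cache lines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace mirror of the layout pahole reports for
 * struct ib_wc. Assumptions: LP64 target, 4-byte int, 64-byte cache
 * lines. Kernel enums are stood in for by uint32_t and kernel
 * pointers by void *.
 */
struct wc_mirror {
	union {
		uint64_t wr_id;
		void *wr_cqe;		/* stand-in for struct ib_cqe * */
	};
	uint32_t status;		/* stand-in for enum ib_wc_status */
	uint32_t opcode;		/* stand-in for enum ib_wc_opcode */
	uint32_t vendor_err;
	uint32_t byte_len;
	void *qp;			/* stand-in for struct ib_qp * */
	union {
		uint32_t imm_data;	/* __be32 in the kernel */
		uint32_t invalidate_rkey;
	} ex;
	uint32_t src_qp;
	uint32_t slid;
	int wc_flags;
	uint16_t pkey_index;
	uint8_t sl;
	uint8_t dlid_path_bits;
	uint8_t port_num;
	uint8_t smac[6];		/* followed by a 1 byte hole */
	uint16_t vlan_id;
	uint8_t network_hdr_type;	/* followed by 1 byte of padding */
};

/* Compile-time checks against the offsets and size shown by pahole. */
static_assert(sizeof(void *) == 8, "sketch assumes an LP64 target");
static_assert(offsetof(struct wc_mirror, qp) == 24, "qp offset");
static_assert(offsetof(struct wc_mirror, vlan_id) == 60, "vlan_id offset");
static_assert(sizeof(struct wc_mirror) == 64, "fits one 64-byte line");

/* Size of the mirror struct in bytes. */
size_t wc_mirror_size(void)
{
	return sizeof(struct wc_mirror);
}

/*
 * Number of cache lines of width `line` covered by an object of
 * `size` bytes (size > 0) starting at address `addr`.
 */
size_t lines_touched(uintptr_t addr, size_t size, size_t line)
{
	return (addr + size - 1) / line - addr / line + 1;
}
```

With these assumptions, lines_touched(0, 64, 64) is 1 and
lines_touched(8, 64, 64) is 2: the size already fits a single line, and
only the placement of an instance decides whether two lines are touched.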