Each time we access any of these arrays, even for a single index,
we fetch a whole cacheline. Reduce cacheline bounces by packing these
members into a cacheline-aligned struct (swr_ctx) and allocating a single
array of it; accessing one entry then fetches all of these members in a
single shot. Since the receive queue needs only the wrid, we use a
nameless union in which rwr_ctx has only the wrid member.
Do you have before/after performance numbers for this patch to support
the proposed change?
I didn't take the time to measure cache hits/misses. I just noticed it
a while ago and it's been bugging me for some time so I figured I'd
send it out...
Also, I asked you the same question about the iser remote invalidation
series [1]; this data is needed there too.
I didn't get to it; we had rough initial numbers, but we have yet to do
a full evaluation on different devices.
+/* Please don't let this exceed a single cacheline */
+struct swr_ctx {
+ u64 wrid;
+ u32 wr_data;
+ struct wr_list w_list;
+ u32 wqe_head;
+ u8 rsvd[12];
+}__packed;
What is the role of the reserved field? Is it for alignment purposes? If
so, maybe a better name would be "align".
OK. I can change it.
Nit, (same) checkpatch error here and below
ERROR: space required after that close brace '}'
#111: FILE: drivers/infiniband/hw/mlx5/mlx5_ib.h:139:
+}__packed;
Will fix.