On 1/12/2016 4:44 PM, Sagi Grimberg wrote:
Each time we access any of these arrays, even for a single index,
we fetch a cacheline. Reduce cacheline bounces by fitting these members
into a cacheline-aligned struct (swr_ctx) and allocating an array of
them. Accessing this array then fetches all of these members in a
single shot. Since the receive queue needs only the wrid, we use a
nameless union where the rwr_ctx has only the wrid member.
Do you have performance numbers from before/after this patch to support
the proposed change?
I didn't take the time to measure cache hits/misses. I just noticed this
a while ago and it's been bugging me for some time, so I figured I'd
send it out...
The thing is that for data-path changes in high-performance network
drivers, we at least need to know that performance is as good as it was
before the change. So you could run your iser perf IOPS test
before/after the change and post 1-2 lines with the results as part of
the change-log.