On 06/09/2015 08:04 AM, Peter Zijlstra wrote:
> On Mon, Jun 08, 2015 at 06:20:44PM -0400, Waiman Long wrote:
>> The current cmpxchg() loop in setting the _QW_WAITING flag for writers
>> in queue_write_lock_slowpath() will contend with incoming readers
>> causing possibly extra cmpxchg() operations that are wasteful. This
>> patch changes the code to do a byte cmpxchg() to eliminate contention
>> with new readers.
> This is very narrow, would not the main cost still be the cacheline
> transfers?
> Do you have any numbers to back this? I would feel much better about
> this if there's real numbers attached.
I have just sent out a v2 patch that includes the microbenchmark data for
the second patch. Each extra cmpxchg() caused by reader contention should
cost about the same as a cacheline miss, so the performance gain depends
on how often this kind of reader contention happens.
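To make the reader contention concrete, here is a rough user-space sketch,
not the kernel code itself. It assumes a little-endian layout, the GCC/Clang
__atomic builtins, and illustrative constant values; union qrw and the
helper names (qrw_init, reader_arrives, set_waiting_word, set_waiting_byte)
are hypothetical and exist only for this example. It shows why a full-word
compare-and-swap fails when a reader bumps the count after the writer took
its snapshot, while a byte-wide compare-and-swap on the writer byte does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values, assumed for this sketch. */
#define QW_WAITING 0x01u      /* a writer is waiting for the lock */
#define QW_WMASK   0xffu      /* writer state lives in the low byte */
#define QR_BIAS    (1u << 8)  /* one reader; count sits above the writer byte */

/* Assumes little-endian, so b.wmode aliases the low byte of cnts. */
union qrw {
    uint32_t cnts;
    struct { uint8_t wmode; uint8_t rcnt[3]; } b;
};

/* Zero the lock and check the layout assumption at run time. */
void qrw_init(union qrw *l)
{
    l->cnts = QW_WAITING;
    assert(l->b.wmode == QW_WAITING);   /* fails on big-endian */
    l->cnts = 0;
}

/* A new reader increments the reader count in the same word. */
void reader_arrives(union qrw *l)
{
    __atomic_fetch_add(&l->cnts, QR_BIAS, __ATOMIC_ACQUIRE);
}

/*
 * One attempt of the word-wide approach: `snapshot` is the value of cnts
 * read earlier. Any change to the word since then, including a reader
 * count change, makes the compare-and-swap fail and forces a retry.
 */
bool set_waiting_word(union qrw *l, uint32_t snapshot)
{
    if (snapshot & QW_WMASK)
        return false;                   /* another writer is already there */
    return __atomic_compare_exchange_n(&l->cnts, &snapshot,
                                       snapshot | QW_WAITING, false,
                                       __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}

/*
 * One attempt of the byte-wide approach: only the writer byte is compared,
 * so changes to the reader count cannot make the operation fail.
 */
bool set_waiting_byte(union qrw *l)
{
    uint8_t expected = 0;
    return __atomic_compare_exchange_n(&l->b.wmode, &expected,
                                       (uint8_t)QW_WAITING, false,
                                       __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}
```

If a reader arrives between the writer's read of cnts and its
set_waiting_word() call, that call fails and has to be retried, whereas
set_waiting_byte() still succeeds on the first try. The cacheline still
moves between CPUs in both cases, so this only removes the retries.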
Regards,
Longman