This patch set builds on Davidlohr's rwsem optimistic spinning patch
set to improve rwsem performance and to make more aggressive use of
optimistic spinning.

The following table shows execution times from a microbenchmark that
runs 1 million lock-unlock operations per thread on a 4-socket, 40-core
Westmere-EX x86-64 test machine with 3.16-rc7 based kernels. The 2 or
10 threads run on different CPUs of the same socket, and load is the
number of pause instructions in the critical section:

  lock/r:w ratio  # of threads  Load:Execution Time (ms)
  --------------  ------------  ------------------------
  mutex                 2       1:530.7, 5:406.0, 10:472.7
  mutex                10       1:1848 , 5:2046 , 10:4394

  Before patch:
  rwsem/0:1             2       1:339.4, 5:368.9, 10:394.0
  rwsem/1:1             2       1:2915 , 5:2621 , 10:2764
  rwsem/10:1            2       1:891.2, 5:779.2, 10:827.2
  rwsem/0:1            10       1:5618 , 5:5722 , 10:5683
  rwsem/1:1            10       1:14562, 5:14561, 10:14770
  rwsem/10:1           10       1:5914 , 5:5971 , 10:5912

  After patch:
  rwsem/0:1             2       1:161.1, 5:244.4, 10:271.4
  rwsem/1:1             2       1:188.8, 5:212.4, 10:312.9
  rwsem/10:1            2       1:168.8, 5:179.5, 10:209.8
  rwsem/0:1            10       1:1306 , 5:1733 , 10:1998
  rwsem/1:1            10       1:1512 , 5:1602 , 10:2093
  rwsem/10:1           10       1:1267 , 5:1458 , 10:2233

  % Change:
  rwsem/0:1             2       1:-52.5%, 5:-33.7%, 10:-31.1%
  rwsem/1:1             2       1:-93.5%, 5:-91.9%, 10:-88.7%
  rwsem/10:1            2       1:-81.1%, 5:-77.0%, 10:-74.6%
  rwsem/0:1            10       1:-76.8%, 5:-69.7%, 10:-64.8%
  rwsem/1:1            10       1:-89.6%, 5:-89.0%, 10:-85.8%
  rwsem/10:1           10       1:-78.6%, 5:-75.6%, 10:-62.2%
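
The "% Change" rows appear to be the relative change in execution time
from before to after the patch, so a negative value means the patched
kernel ran faster. As a minimal sketch of that arithmetic (the helper
name percent_change is hypothetical and not part of the patch set), the
rwsem/1:1 row with 2 threads can be reproduced from the numbers above:

```python
def percent_change(before_ms, after_ms):
    """Relative change in execution time; negative means faster."""
    return (after_ms - before_ms) / before_ms * 100.0


# rwsem/1:1 with 2 threads, copied from the table (load 1, 5, 10).
before = [2915.0, 2621.0, 2764.0]
after = [188.8, 212.4, 312.9]

# Round to one decimal place, matching the precision used in the table.
changes = [round(percent_change(b, a), 1) for b, a in zip(before, after)]
print(changes)
```

The same calculation applied to the other rows gives the remaining
"% Change" entries.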