On 06/15/2016 01:22 PM, Peter Zijlstra wrote:
On Tue, Jun 14, 2016 at 06:48:05PM -0400, Waiman Long wrote:
Currently, when down_read() fails, the active read locking isn't undone
until the rwsem_down_read_failed() function grabs the wait_lock. If the
wait_lock is contended, it may take a while to get the lock. During
that period, writer lock stealing will be disabled because of the
active read lock.
This patch will release the active read lock ASAP so that writer lock
stealing can happen sooner. The only downside is when the reader is
the first one in the wait queue as it has to issue another atomic
operation to update the count.
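To make the count manipulation concrete, here is a minimal user-space sketch using C11 atomics. It is not the kernel code. It assumes the 64-bit count layout (active count in the low 32 bits, a negative waiting bias above it), it does not model wait_lock or the wait queue, and every toy_* name is hypothetical:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Bias values modeled on the 64-bit rwsem count layout (assumption). */
#define ACTIVE_BIAS        INT64_C(1)
#define ACTIVE_MASK        INT64_C(0xffffffff)
#define WAITING_BIAS       (-ACTIVE_MASK - 1)
#define ACTIVE_READ_BIAS   ACTIVE_BIAS
#define ACTIVE_WRITE_BIAS  (WAITING_BIAS + ACTIVE_BIAS)

struct toy_rwsem {
    _Atomic int64_t count;
};

/* Reader fast path: add the read bias; a negative result means failure. */
static bool toy_down_read_fast(struct toy_rwsem *sem)
{
    int64_t newval = atomic_fetch_add(&sem->count, ACTIVE_READ_BIAS)
                     + ACTIVE_READ_BIAS;
    return newval >= 0;
}

/* Early release: undo the read bias without waiting for wait_lock. */
static void toy_undo_read_bias(struct toy_rwsem *sem)
{
    atomic_fetch_sub(&sem->count, ACTIVE_READ_BIAS);
}

/* Writer lock stealing: possible only when no active locker is recorded. */
static bool toy_try_write_steal(struct toy_rwsem *sem)
{
    int64_t old = atomic_load(&sem->count);

    if (old != 0 && old != WAITING_BIAS)
        return false;
    return atomic_compare_exchange_strong(&sem->count, &old,
                                          old + ACTIVE_WRITE_BIAS);
}

/*
 * Scenario: waiters are queued and the lock is free. A reader fails the
 * fast path and is (conceptually) still waiting for wait_lock. Returns
 * true if a writer can steal the lock during that window.
 */
static bool toy_writer_can_steal(bool early_release)
{
    struct toy_rwsem sem;
    bool got;

    atomic_init(&sem.count, WAITING_BIAS);
    got = toy_down_read_fast(&sem);
    assert(!got);               /* count is negative, so the reader fails */
    (void)got;

    if (early_release)
        toy_undo_read_bias(&sem);
    return toy_try_write_steal(&sem);
}
```

In this model, toy_writer_can_steal(false) fails because the stale read bias keeps the count away from the values the writer checks for, while toy_writer_can_steal(true) succeeds. The extra atomic operation mentioned above would be the later addition of the waiting bias by a reader that turns out to be first in the queue, which the sketch leaves out.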
On a 4-socket Haswell machine running a 4.7-rc1 tip-based kernel,
fio tests with multithreaded randrw and randwrite workloads on the
same file on an XFS partition on top of an NVDIMM with DAX were run.
The aggregated bandwidths before and after the patch were as follows:
  Test       BW before patch  BW after patch  % change
  ----       ---------------  --------------  --------
  randrw        1210 MB/s        1352 MB/s      +12%
  randwrite     1622 MB/s        1710 MB/s      +5.4%
The write-only microbench also showed improvement because some read
locking was done by the XFS code.
How does a reader only micro-bench react? I'm thinking the extra atomic
might hurt a bit.
A reader-only benchmark will not go into the slowpath at all. The
reader slowpath is executed only when there is a mix of readers and
writers.
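A small user-space model shows why. This is only a sketch under the assumption of the 64-bit count layout, where a writer holding the lock makes the count negative; the MINI_* constants and mini_slowpath_entries() are hypothetical names, not kernel symbols:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MINI_READ_BIAS   INT64_C(1)
/* Waiting bias plus one active locker, as set by a writer (assumption). */
#define MINI_WRITE_BIAS  (-INT64_C(0xffffffff))

/*
 * Let nreaders readers take (and hold) the lock one after another and
 * return how many of them would have entered the read slowpath.
 */
static int mini_slowpath_entries(int nreaders, bool writer_holds_lock)
{
    _Atomic int64_t count;
    int slow = 0;

    assert(nreaders >= 0);
    atomic_init(&count, writer_holds_lock ? MINI_WRITE_BIAS : 0);

    for (int i = 0; i < nreaders; i++) {
        /* Reader fast path: one atomic add, then a sign check. */
        int64_t newval = atomic_fetch_add(&count, MINI_READ_BIAS)
                         + MINI_READ_BIAS;
        if (newval < 0)
            slow++;     /* the real code would enter the slowpath here */
    }
    return slow;
}
```

With readers only, the count never goes negative, so mini_slowpath_entries(1000, false) returns 0 and the extra atomic is never issued. With a writer holding the lock, every reader in the model sees a negative count and would take the slowpath.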
I think there will be a small performance impact for a workload that
produces just the right amount of rwsem contention. However, it is
hard to write a microbenchmark that creates exactly that amount of
contention, and even in that particular case the performance
degradation will be pretty small. As the amount of contention
increases, I believe this patch will help performance instead of
hurting it.
Cheers,
Longman