On 01/20/2014 10:18 AM, Peter Zijlstra wrote:
On Tue, Jan 14, 2014 at 11:44:06PM -0500, Waiman Long wrote:
This patch modifies the queue_write_unlock() function to use the new
smp_store_release() function (currently in tip). It also removes the
temporary implementations of the smp_load_acquire() and
smp_store_release() functions in qrwlock.c.
If the writer field cannot be accessed atomically, this patch falls
back to an atomic subtraction instead.
Signed-off-by: Waiman Long <Waiman.Long@xxxxxx>
---
include/asm-generic/qrwlock.h | 10 ++++++----
kernel/locking/qrwlock.c | 34 ----------------------------------
2 files changed, 6 insertions(+), 38 deletions(-)
diff --git a/include/asm-generic/qrwlock.h b/include/asm-generic/qrwlock.h
index 5abb6ca..68f488b 100644
--- a/include/asm-generic/qrwlock.h
+++ b/include/asm-generic/qrwlock.h
@@ -181,11 +181,13 @@ static inline void queue_read_unlock(struct qrwlock *lock)
static inline void queue_write_unlock(struct qrwlock *lock)
{
/*
- * Make sure that none of the critical section will be leaked out.
+ * If the writer field is atomic, it can be cleared directly.
+ * Otherwise, an atomic subtraction will be used to clear it.
*/
- smp_mb__before_clear_bit();
- ACCESS_ONCE(lock->cnts.writer) = 0;
- smp_mb__after_clear_bit();
+ if (__native_word(lock->cnts.writer))
+ smp_store_release(&lock->cnts.writer, 0);
+ else
+ atomic_sub(_QW_LOCKED,&lock->cnts.rwa);
}
If we're a writer, read-count must be zero. The only way for that not to
be zero is a concurrent read-(try)lock. If you move all the
read-(try)locks over to cmpxchg() you can avoid this afaict:
That is not true. A reader in the fast path may transiently bump the
reader count to a non-zero value even while a writer holds the lock.
Also, a reader in interrupt context will force the reader count to a
non-zero value so that it can take the read lock as soon as the writer
is done.
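
To make the transient count concrete, here is a minimal user-space sketch built on C11 atomics rather than the kernel primitives. The bit layout, the demo_ names, and the QW_LOCKED and QR_BIAS values are assumptions for illustration only; they are not the actual qrwlock definitions. It shows a fast-path reader adding its bias while the writer still holds the lock, and then compares clearing only the writer bits against storing 0 to the whole word.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed layout: low byte is the writer field, upper bits count readers. */
#define QW_LOCKED 0x01u   /* stands in for _QW_LOCKED */
#define QR_BIAS   0x100u  /* stands in for _QR_BIAS   */

struct demo_lock {
	_Atomic uint32_t rwa;
};

/* Reader fast path: add the bias first, look for a writer afterwards. */
static void demo_reader_fast_add(struct demo_lock *l)
{
	atomic_fetch_add_explicit(&l->rwa, QR_BIAS, memory_order_acquire);
}

/* Unlock by subtracting the writer bits; reader bits are preserved. */
static void demo_write_unlock_sub(struct demo_lock *l)
{
	atomic_fetch_sub_explicit(&l->rwa, QW_LOCKED, memory_order_release);
}

/* Unlock by storing 0 to the whole word; any reader increment is lost. */
static void demo_write_unlock_store_word(struct demo_lock *l)
{
	atomic_store_explicit(&l->rwa, 0, memory_order_release);
}

/* Replay the interleaving sequentially and return the final lock word. */
static uint32_t demo_unlock_after_reader(int use_sub)
{
	struct demo_lock l;

	atomic_init(&l.rwa, QW_LOCKED);		/* writer holds the lock */
	demo_reader_fast_add(&l);		/* reader arrives in the fast path */

	/* The writer is still in, yet the reader count is already non-zero. */
	assert(atomic_load(&l.rwa) == (QW_LOCKED | QR_BIAS));

	if (use_sub)
		demo_write_unlock_sub(&l);
	else
		demo_write_unlock_store_word(&l);

	return atomic_load(&l.rwa);
}
```

In this sketch, demo_unlock_after_reader(1) leaves the reader's bias in place, while demo_unlock_after_reader(0) wipes it out. A whole-word store on unlock is therefore only safe if readers never modify the word while a writer is present.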
static inline int queue_read_trylock(struct qrwlock *lock)
{
	union qrwcnts cnts;

	cnts = ACCESS_ONCE(lock->cnts);
	if (!cnts.writer) {
		/* Only bump the reader count if nothing changed since the load. */
		if (cmpxchg(&lock->cnts.rwc, cnts.rwc, cnts.rwc + _QR_BIAS) == cnts.rwc)
			return 1;
	}
	return 0;
}
static inline void queue_read_lock(struct qrwlock *lock)
{
	if (!queue_read_trylock(lock))
		queue_read_lock_slowpath(lock); // XXX do not assume extra _QR_BIAS
}
At which point you have the guarantee that read-count == 0, and you can
write:
static inline void queue_write_unlock(struct qrwlock *lock)
{
	smp_store_release(&lock->cnts.rwc, 0);
}
No?
The current code is optimized for the reader-heavy case. So I used xadd
to increment the reader count, which reduces the chance of a retry due
to concurrent reader count updates. The downside is the need to back out
if a writer is present.
I can change the logic to use only cmpxchg for readers, but I don't see
a compelling reason to do so.
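
The trade-off between the two reader fast paths can be sketched in user-space C11 as follows. This is an illustration under assumptions, not the qrwlock code: the demo_ names, the QW_MASK and QR_BIAS values, and the single 32-bit lock word are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define QW_MASK 0xffu    /* assumed: low byte is the writer field */
#define QR_BIAS 0x100u   /* assumed: one reader, stands in for _QR_BIAS */

/* xadd style: always increment, then back out if a writer was present. */
static bool demo_read_trylock_xadd(_Atomic uint32_t *cnts)
{
	uint32_t old = atomic_fetch_add_explicit(cnts, QR_BIAS,
						 memory_order_acquire);

	if (!(old & QW_MASK))
		return true;

	/* Writer present: undo the transient increment. */
	atomic_fetch_sub_explicit(cnts, QR_BIAS, memory_order_relaxed);
	return false;
}

/* cmpxchg style: never modify the word while a writer is present. */
static bool demo_read_trylock_cmpxchg(_Atomic uint32_t *cnts)
{
	uint32_t old = atomic_load_explicit(cnts, memory_order_relaxed);

	if (old & QW_MASK)
		return false;

	/* Single attempt; fails if the word changed since the load. */
	return atomic_compare_exchange_strong_explicit(cnts, &old,
			old + QR_BIAS,
			memory_order_acquire, memory_order_relaxed);
}

/* Run one trylock on a fresh word and return the resulting value. */
static uint32_t demo_try_on(uint32_t initial, bool use_xadd)
{
	_Atomic uint32_t cnts;
	bool ok;

	atomic_init(&cnts, initial);
	ok = use_xadd ? demo_read_trylock_xadd(&cnts)
		      : demo_read_trylock_cmpxchg(&cnts);

	/* Uncontended: success exactly when no writer was present. */
	assert(ok == !(initial & QW_MASK));
	return atomic_load(&cnts);
}
```

Both variants end in the same state when there is no contention. They differ under concurrency: the xadd variant never fails because of other readers but writes to the shared word even when it must back out, whereas the cmpxchg variant leaves the word untouched when a writer is present but can fail whenever another reader changes the count between the load and the compare-and-exchange.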
-Longman