> On 2.6.39, the contention of anon_vma->lock occupies 3.25% of cpu.
> However, after the switch of the lock to mutex on 3.0-rc2, the mutex
> acquisition jumps to 18.6% of cpu. This seems to be the main cause of
> the 52% throughput regression.
>

This patch makes the mutex in Tim's workload take a bit less CPU time
(4% down) but it doesn't really fix the regression. When spinning for a
value it's always better to read it first before attempting to write it.
This saves expensive operations on the interconnect.

So it's not really a fix for this, but may be a slight improvement for
other workloads.

-Andi

From 34d4c1e579b3dfbc9a01967185835f5829bd52f0 Mon Sep 17 00:00:00 2001
From: Andi Kleen <ak@xxxxxxxxxxxxxxx>
Date: Tue, 14 Jun 2011 16:27:54 -0700
Subject: [PATCH] mutex: while spinning read count before attempting cmpxchg

Under heavy contention it's better to read first before trying to
do an atomic operation on the interconnect.

This gives a few percent improvement for the mutex CPU time under
heavy contention and likely saves some power too.

Signed-off-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>
---
 kernel/mutex.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/kernel/mutex.c b/kernel/mutex.c
index d607ed5..1abffa9 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -170,7 +170,8 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 		if (owner && !mutex_spin_on_owner(lock, owner))
 			break;
 
-		if (atomic_cmpxchg(&lock->count, 1, 0) == 1) {
+		if (atomic_read(&lock->count) == 1 &&
+		    atomic_cmpxchg(&lock->count, 1, 0) == 1) {
 			lock_acquired(&lock->dep_map, ip);
 			mutex_set_owner(lock);
 			preempt_enable();
-- 
1.7.4.4
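
For readers who want to try the read-before-cmpxchg idea outside the
kernel, here is a minimal userspace sketch using C11 atomics. It is not
the kernel code: demo_mutex, demo_trylock, demo_lock_spin and
demo_unlock are hypothetical names made up for this illustration, and
the only thing borrowed from the patch is the convention that a count
of 1 means unlocked and 0 means locked. The memory orderings are my
assumption of a reasonable choice, not something taken from
kernel/mutex.c.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for lock->count: 1 = unlocked, 0 = locked. */
typedef struct {
    atomic_int count;
} demo_mutex;

static void demo_mutex_init(demo_mutex *m)
{
    atomic_init(&m->count, 1);
}

/*
 * One acquisition attempt. The plain load comes first, so a spinner
 * that sees the lock held never issues the atomic read-modify-write
 * and never asks for the cache line in exclusive state.
 */
static bool demo_trylock(demo_mutex *m)
{
    int expected = 1;

    if (atomic_load_explicit(&m->count, memory_order_relaxed) != 1)
        return false;

    /* The lock looked free; only now pay for the compare-and-swap. */
    return atomic_compare_exchange_strong_explicit(
        &m->count, &expected, 0,
        memory_order_acquire, memory_order_relaxed);
}

/* Spin until acquired. Real code would also relax the CPU in the loop. */
static void demo_lock_spin(demo_mutex *m)
{
    while (!demo_trylock(m))
        ;
}

static void demo_unlock(demo_mutex *m)
{
    int old = atomic_exchange_explicit(&m->count, 1, memory_order_release);

    /* This sketch has no waiters, so the old value must be "locked". */
    assert(old == 0);
    (void)old;
}
```

The load can still race with another CPU taking the lock, which is why
the compare-and-swap result is what decides ownership; the load only
filters out attempts that would have failed anyway.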