Re: REGRESSION: Performance regressions from switching anon_vma->lock to mutex

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



* Andi Kleen <ak@xxxxxxxxxxxxxxx> wrote:

> > On 2.6.39, the contention of anon_vma->lock occupies 3.25% of cpu.
> > However, after the switch of the lock to mutex on 3.0-rc2, the mutex
> > acquisition jumps to 18.6% of cpu.  This seems to be the main cause of
> > the 52% throughput regression.
> > 
> This patch makes the mutex in Tim's workload take a bit less CPU time
> (4% down) but it doesn't really fix the regression. When spinning for a 
> value it's always better to read it first before attempting to write it.
> This saves expensive operations on the interconnect.
> 
> So it's not really a fix for this, but may be a slight improvement for 
> other workloads.
> 
> -Andi
> 
> >From 34d4c1e579b3dfbc9a01967185835f5829bd52f0 Mon Sep 17 00:00:00 2001
> From: Andi Kleen <ak@xxxxxxxxxxxxxxx>
> Date: Tue, 14 Jun 2011 16:27:54 -0700
> Subject: [PATCH] mutex: while spinning read count before attempting cmpxchg
> 
> Under heavy contention it's better to read first before trying to 
> do an atomic operation on the interconnect.
> 
> This gives a few percent improvement for the mutex CPU time under 
> heavy contention and likely saves some power too.
> 
> Signed-off-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>
> ---
>  kernel/mutex.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/kernel/mutex.c b/kernel/mutex.c
> index d607ed5..1abffa9 100644
> --- a/kernel/mutex.c
> +++ b/kernel/mutex.c
> @@ -170,7 +170,8 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  		if (owner && !mutex_spin_on_owner(lock, owner))
>  			break;
>  
> -		if (atomic_cmpxchg(&lock->count, 1, 0) == 1) {
> +		if (atomic_read(&lock->count) == 1 && 
> +		    atomic_cmpxchg(&lock->count, 1, 0) == 1) {
>  			lock_acquired(&lock->dep_map, ip);
>  			mutex_set_owner(lock);
>  			preempt_enable();


What is *sorely* missing from your changelog (again) is the 
explanation of *why* it improves performance in the contended case: 
because the cacheline is brought into shared-read MESI state which 
the CMPXCHG might not dirty if the CMPXCHG fails in the contended 
case.

Firstly, this reduces the cost of hard bounces somewhat because the 
owner CPU still has the cacheline in read-shared state, which it can 
invalidate from the other CPU's cache on unlock in a bit cheaper way 
if it were forced to bounce the cacheline back.

Secondly, in the contended case more that 2 CPUs are looping on the 
same cacheline and bringing it in shared and doing the cmpxchg loop 
will not bounce the cacheline around (if CMPXCHG is smart enough to 
not dirty the cacheline even in the failed case - this is CPU model 
dependent) - it will spread to all interested CPUs in read-shared 
state. This is most likely the dominant factor in Tim's tests.

Had you considered and described all that then you'd inevitably have 
been led to consider the much more important issue that is missing 
from your patch: the analysis of what happens to the cacheline under 
*light* contention, which is by far the most important case ...

In the lightly contended case it's ultimately bad to bring in the 
cacheline as read-shared first, because the cmpxchg will have to go 
out to the MESI interconnect *again*: this time to flush the 
cacheline from the previous owner CPU's cache.

So i don't think we want your patch, without some really good 
supporting facts and analysis that show that the lightly contended 
case does not suffer.

Thanks,

	Ingo

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]