Re: [PATCH -v6][RFC]: mutex: implement adaptive spinning

* Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Wed, 2009-01-07 at 15:10 -0800, Linus Torvalds wrote:
> 
> > Please take all my patches to be pseudo-code. They've neither been 
> > compiled nor tested, and I'm just posting them in the hope that somebody 
> > else will then do things in the direction I think is the proper one ;)
> 
> Linux opteron 2.6.28-tip #585 SMP PREEMPT Thu Jan 8 10:38:09 CET 2009 x86_64 x86_64 x86_64 GNU/Linux
> 
> [root@opteron bench]# echo NO_OWNER_SPIN > /debug/sched_features; ./timec -e -5,-4,-3,-2 ./test-mutex V 16 10
> 2 CPUs, running 16 parallel test-tasks.
> checking VFS performance.
> avg ops/sec:               74996
> 
>  Performance counter stats for './test-mutex':
> 
>    12098.324578  task clock ticks     (msecs)
> 
>            1081  CPU migrations       (events)
>            7102  context switches     (events)
>            2763  pagefaults           (events)
> 
>  Wall-clock time elapsed: 12026.804839 msecs
> 
> [root@opteron bench]# echo OWNER_SPIN > /debug/sched_features; ./timec -e -5,-4,-3,-2 ./test-mutex V 16 10
> 2 CPUs, running 16 parallel test-tasks.
> checking VFS performance.
> avg ops/sec:               228126
> 
>  Performance counter stats for './test-mutex':
> 
>    22280.283224  task clock ticks     (msecs)
> 
>             117  CPU migrations       (events)
>            5711  context switches     (events)
>            2781  pagefaults           (events)
> 
>  Wall-clock time elapsed: 12307.053737 msecs
> 
> * WOW *

WOW indeed - and I can see a similar _brutal_ speedup on two separate 
16-way boxes as well:

  16 CPUs, running 128 parallel test-tasks.

  NO_OWNER_SPIN:
  avg ops/sec:               281595

  OWNER_SPIN:
  avg ops/sec:               524791

Da Killer!

Look at the performance counter stats. NO_OWNER_SPIN:

>    12098.324578  task clock ticks     (msecs)
> 
>            1081  CPU migrations       (events)
>            7102  context switches     (events)
>            2763  pagefaults           (events)

versus OWNER_SPIN:

>    22280.283224  task clock ticks     (msecs)
> 
>             117  CPU migrations       (events)
>            5711  context switches     (events)
>            2781  pagefaults           (events)

We were able to spend almost twice as much CPU time, and to spend it 
efficiently - and we did only about a tenth as many cross-CPU 
migrations as before (!).
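
Working the ratios out from the 2-CPU run above:

  CPU migrations:     117 /  1081  ~  0.108   (~10% as many)
  task clock:       22280 / 12098  ~  1.84x
  avg ops/sec:     228126 / 74996  ~  3.04x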

My (wild) guess is that the biggest speedup factor was perhaps this little 
trick:

+               if (need_resched())
+                       break;

this allows the spinning mutex to burn CPU time only when that CPU has 
nothing else to do anyway (i.e. when there's no other task that wants 
to run). The moment some other task shows up, we context-switch to it.

Very elegant concept, I think.
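
To make the shape of the loop concrete, here is a rough sketch - very 
much in the spirit of Linus's "pseudo-code" disclaimer above, with 
illustrative field names (lock->owner, owner->oncpu), not the actual 
patch:

  static int spin_on_owner(struct mutex *lock, struct task_struct *owner)
  {
          for (;;) {
                  /* Owner released the lock - go try to acquire it. */
                  if (lock->owner != owner)
                          return 1;

                  /*
                   * The trick quoted above: stop burning cycles the
                   * moment some other task wants this CPU.
                   */
                  if (need_resched())
                          break;

                  /* No point spinning while the owner itself is blocked. */
                  if (!owner->oncpu)
                          break;

                  cpu_relax();    /* SMT- and power-friendly busy wait */
          }
          return 0;               /* fall back to the sleeping slow-path */
  }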

[ A detail: -tip testing found that the patch breaks mutex debugging:

  =====================================
  [ BUG: bad unlock balance detected! ]
  -------------------------------------
  swapper/0 is trying to release lock (cpu_add_remove_lock) at:
  [<ffffffff8089f540>] mutex_unlock+0xe/0x10
  but there are no more locks to release!

 but that's a detail for -v7 ;-) ]

	Ingo
