Re: gcc 4.9.0 and cilkplus high kernel cpu usage?

Cilk is of course total nonsense to use,
unless you have an NCSA supercomputer worth half a billion
and are too lazy to write something that uses it efficiently.

I remember a comparison with Cilkchess here at my home,
where Cilkchess lost a factor of 40 somewhere thanks to Cilk :)

Don Dailey: I remember him sitting over here (in my house),
playing the Cilk version remotely against Diep.

Then, after Cilkchess had lost 4 games or so, getting 5000 positions a second on a single core, Don said: "OK, now it's time to play without Cilk".

At first I didn't know what he meant, but he was referring to running
the program on his laptop without Cilk. It got 200k positions a second. A factor of 40 faster :)

We can mathematically prove why it is tough for applications that need low latency to use Cilk.

The overhead is HUGE.



On Mon, 28 Apr 2014, Stefan Ruppert wrote:

Hi,

Over the last few days I have been testing the Cilk Plus feature of gcc 4.9. The standard Fibonacci example works fine here, but my program has high kernel CPU usage and is slower than the non-Cilk Plus (single-threaded) version.

My program calculates the minimum-distance route between the given cities in Germany. It builds a complete tree in which the root is the start of the route and each leaf is a possible end of the route.
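
The tree described above can be sketched as a depth-first enumeration. This is only an illustration, not the actual myroute source: the distance matrix `dist`, the `SearchResult` struct, and the function names are hypothetical. With the start city fixed as the root, 10 cities give 9! = 362880 leaves, which matches the "leaves: 362880" line in the output below.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Outcome of the exhaustive search: leaves visited and shortest total length.
struct SearchResult {
    std::size_t leaves = 0;
    double best = std::numeric_limits<double>::infinity();
};

// Depth-first walk over the route tree. `dist` is a hypothetical square
// distance matrix; `current` is the last city on the partial route and
// `depth` is the number of cities already placed on it.
void explore(const std::vector<std::vector<double>>& dist,
             std::vector<bool>& visited, std::size_t current,
             std::size_t depth, double length, SearchResult& out) {
    const std::size_t n = dist.size();
    if (depth == n) {  // every city is on the route: this node is a leaf
        ++out.leaves;
        if (length < out.best) out.best = length;
        return;
    }
    for (std::size_t next = 0; next < n; ++next) {
        if (visited[next]) continue;
        visited[next] = true;
        explore(dist, visited, next, depth + 1,
                length + dist[current][next], out);
        visited[next] = false;  // backtrack before trying the next branch
    }
}

// City 0 is the fixed start, i.e. the root of the tree.
SearchResult shortest_route(const std::vector<std::vector<double>>& dist) {
    SearchResult out;
    if (dist.empty()) return out;
    std::vector<bool> visited(dist.size(), false);
    visited[0] = true;
    explore(dist, visited, 0, 1, 0.0, out);
    return out;
}
```

Calling `shortest_route` on any 10x10 matrix visits all 362880 leaves and returns the shortest open route (no return to the start) in `best`.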

A route with 10 cities takes about 2.1 seconds in the non-Cilk version. The Cilk version, which spawns 4 Cilk tasks, needs about 2.5 seconds:

Non-cilk (single threaded) version:
$ time ./myroute 65830 60306 55130 Sörgenloch 25849 65439 52388 Berlin München Hamburg

leaves: 362880

route: |-- Kriftel --[8.14497km]--> Wicker, Main-Taunus-Kreis --[9.55325km]--> Weisenau --[12.6317km]--> Sörgenloch --[148.816km]--> Wissersheim --[160.604km]--> Frankfurt am Main --[217.106km]--> München --[354.114km]--> Berlin --[255.292km]--> Hamburg --[137.506km]--> Westertilli--| total distance is 1303.77km.

real	0m2.118s
user	0m2.040s
sys	0m0.032s


Cilk-version with 4-worker threads:
$ time ./myroute -c 65830 60306 55130 Sörgenloch 25849 65439 52388 Berlin München Hamburg

leaves: 362880

route: |-- Kriftel --[8.14497km]--> Wicker, Main-Taunus-Kreis --[9.55325km]--> Weisenau --[12.6317km]--> Sörgenloch --[148.816km]--> Wissersheim --[160.604km]--> Frankfurt am Main --[217.106km]--> München --[354.114km]--> Berlin --[255.292km]--> Hamburg --[137.506km]--> Westertilli--| total distance is 1303.77km.

real	0m2.564s
user	0m3.972s
sys	0m4.468s

I also found that when I set the number of workers to 2, I get a slightly faster response time than the non-Cilk version:

Cilk-version with 2-worker threads:
$ time ./myroute -c 65830 60306 55130 Sörgenloch 25849 65439 52388 Berlin München Hamburg

leaves: 362880

route: |-- Kriftel --[8.14497km]--> Wicker, Main-Taunus-Kreis --[9.55325km]--> Weisenau --[12.6317km]--> Sörgenloch --[148.816km]--> Wissersheim --[160.604km]--> Frankfurt am Main --[217.106km]--> München --[354.114km]--> Berlin --[255.292km]--> Hamburg --[137.506km]--> Westertilli--| total distance is 1303.77km.

real	0m2.045s
user	0m2.452s
sys	0m0.988s

Any idea why the kernel CPU usage is so high?

Regards,
Stefan

PS: Here is my config:
I built gcc 4.9 from source with the following options:

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/opt/devel/build/gcc-4.9.0/libexec/gcc/x86_64-unknown-linux-gnu/4.9.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.9.0/configure --prefix=/opt/devel/build/gcc-4.9.0 --with-system-zlib --with-gmp=/opt/devel/build/gcc-4.9.0 --with-mpfr=/opt/devel/build/gcc-4.9.0 --with-cloog=/opt/devel/build/gcc-4.9.0 --with-mpc=/opt/devel/build/gcc-4.9.0 --with-tune=generic --enable-languages=c,c++ --enable-multilib --with-multilib-list=m32,m64
Thread model: posix
gcc version 4.9.0 (GCC)

$ uname -a
Linux myarm 3.5.0-34-generic #55-Ubuntu SMP Thu Jun 6 20:18:19 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

