Re: libstdc++ and openmp problem with GCC4.4.0 port to interix

Hi,
Still trying to solve the problems porting libstdc++ in GCC 4.4.0 to interix, I notice that when I compile the test code below with the flags -fopenmp -D_GLIBCXX_PARALLEL -O3, it works as expected at a decent speed. But if I omit -D_GLIBCXX_PARALLEL, the weird behaviour with high kernel CPU times and slow execution occurs whenever more than one thread is requested in the program.
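
To put numbers on the kernel-time symptom from inside the program, rather than reading them off Task Manager, something like the sketch below might help. It is only a sketch: it assumes getrusage() from <sys/resource.h> is available and meaningful on the target (it is POSIX, but its behaviour on interix is an assumption here), and CpuSplit, cpu_now(), time_clear_loop() and build_mode() are hypothetical helper names, not part of libstdc++ or libgomp.

```cpp
#include <sys/resource.h>  // getrusage (POSIX; availability on interix is an assumption)
#include <sys/time.h>      // struct timeval
#include <cassert>
#include <string>

// Hypothetical helper type: CPU seconds spent in user mode and in kernel mode.
struct CpuSplit {
    double user_sec;
    double kernel_sec;
};

// Convert a timeval to seconds.
double to_seconds(const timeval& tv)
{
    return tv.tv_sec + tv.tv_usec / 1e6;
}

// CPU time consumed so far by the whole process (all threads, not just the caller).
// Returns zeros if getrusage() fails.
CpuSplit cpu_now()
{
    struct rusage ru;
    CpuSplit s = {0.0, 0.0};
    if (getrusage(RUSAGE_SELF, &ru) == 0) {
        s.user_sec = to_seconds(ru.ru_utime);
        s.kernel_sec = to_seconds(ru.ru_stime);
    }
    return s;
}

// Run the same string::clear() loop as the test program and return
// the user/kernel CPU time the process consumed while it ran.
CpuSplit time_clear_loop(long outer, int inner)
{
    assert(outer >= 0 && inner >= 0);
    std::string pairlbl("");
    CpuSplit before = cpu_now();
    for (long m = 0; m < outer; ++m)
        for (int j = 1; j <= inner; ++j)
            pairlbl.clear();
    CpuSplit after = cpu_now();
    CpuSplit delta = { after.user_sec - before.user_sec,
                       after.kernel_sec - before.kernel_sec };
    return delta;
}

// Which libstdc++ mode this translation unit was compiled in.
const char* build_mode()
{
#ifdef _GLIBCXX_PARALLEL
    return "parallel mode (_GLIBCXX_PARALLEL defined)";
#else
    return "normal mode";
#endif
}
```

Calling time_clear_loop(50000, 2000) after omp_set_num_threads() in each build (with and without -D_GLIBCXX_PARALLEL) and printing the two fields alongside build_mode() would show whether the extra time is really accounted as kernel time. Because RUSAGE_SELF covers every thread in the process, time burned by the otherwise idle OpenMP threads is included in the totals.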

Are there any libstdc++ gurus out there who know what these symptoms might mean?

Many thanks,

Rob

 
----- Original Message ----- 
From: "Robert Oeffner" <robert@xxxxxxxxxxx>
To: <gcc-help@xxxxxxxxxxx>
Sent: Saturday, September 26, 2009 11:20 AM
Subject: libstdc++ and openmp problem with GCC4.4.0 port to interix


> Hi,
> 
> Probably a long shot but I wonder if anyone would have a useful tip on a 
> problem porting gcc4.4.0 to interix (a BSD-like OS running on top of the 
> Windows kernel).
> 
> As libgomp in GCC so far isn't targeting interix I have made some changes to 
> libgomp in my copy of the GCC 4.4.0 distribution.  A new source file was 
> created, gcc-4.4.0/libgomp/config/posix/interix/proc.c, which is templated 
> on the existing gcc-4.4.0/libgomp/config/posix/proc.c and 
> gcc-4.4.0/libgomp/config/posix/mingw32/proc.c in the distribution (see 
> http://www.oeffner.net/stuff/gcc-4.4.0_interix_changes.zip or 
> http://www.suacommunity.com/forum/tm.aspx?m=16600 ). With this file and 
> modifications to GCC configuration files in the distribution I can bootstrap 
> GCC 4.4.0 to build gcc and g++ compilers on interix.
> 
> The port produces fast code for single-threaded programs. However, 
> there's a major problem with OpenMP. It's something to do with libstdc++, 
> which tends to go into overdrive when you request OpenMP to create more than 
> one thread for the compiled program. When calling string::clear() from 
> libstdc++ it somehow hogs the CPU with high kernel times and runs orders of 
> magnitude slower. The code below demonstrates the problem. It runs fast 
> when using just one thread but abysmally slow when two or more threads are 
> present, even though the loop doing the work is actually single-threaded and 
> the other threads remain idle.
> Windows Task Manager shows that execution time is roughly 50% kernel and 50% 
> user time whenever you run more than one thread. Invoked with a single 
> thread, execution time is spent entirely in user mode.
> 
> As far as I know, locking and releasing data objects is done by the OS on 
> behalf of a program's request, and it's done in kernel mode. Are there 
> situations where libstdc++ may be confused about idle threads in a program 
> and then make unnecessary requests for locking and releasing data objects?
> 
> If there is anyone who has a suggestion on what causes these symptoms in my 
> GCC port that would be greatly appreciated.
> 
> Many thanks,
> 
> Rob
> 
> 
> #include <ctime>     // time_t, time, difftime
> #include <iostream>
> #include <string>    // std::string
> #include <omp.h>
> 
> using namespace std;
> 
> const long lmax = 50000;
> 
> int main()
> {
>    int nthreads = 1;
>    cout<<"Enter number of OpenMP threads to create: ";
>    cin >> nthreads;
>    omp_set_num_threads(nthreads);
> 
> #pragma omp parallel
>    {
> #pragma omp single
>        cout << "Doing string stuff with " << omp_get_num_threads()
>             << " thread(s)" << endl;
>    }
> 
>    time_t start, now;
>    time( &start );
> 
>    string pairlbl("");
> 
>    for (long m = 0; m< lmax; m++)
>    {
>        if ((m % (lmax/20))==0)
>            cout << "m = "<<m<<endl;
> 
>        for (int j=1;j<=2000;j++)
>        {
>            pairlbl.clear();
>        }
>    }
> 
>    time( &now);
>    cout<<"\ntime= "<<difftime( now, start )<<" sec\n";
> 
>    return 0;
> }
> 
> 
> 
> 
>


