I am a first-time poster to the gcc lists and not intimately familiar with
compilers, so please bear with me if I don't offer all of the relevant
information up front. I am in the process of porting a large application
from gcc 2.95.3 with libstdc++ 2.10.0 to gcc 3.4.6 with libstdc++ 6.0.3. All
of my development is being done on a dual AMD Opteron machine running
Solaris 10. I have attempted to upgrade my compiler several times over the
past few years but have always hit the same problem: the application hangs
and drives CPU utilization up to 100%. This was impossible to debug on a
single-processor machine, but binding my application to one processor on a
dual-processor machine keeps the machine responsive when the app goes to
100% on one CPU, giving me an opportunity to debug.

I am now determined to find the cause of this hang. It is clear from stack
traces that several threads are stuck inside either
__gnu_cxx::__exchange_and_add () or __gnu_cxx::__atomic_add (), and I
suspect that these functions, deadlocked, are what drive CPU utilization to
100%. So far I have been unable to find any reason why they would or could
deadlock. Most often they are called from a std::basic_string constructor
or destructor, but I also occasionally see them inside a std::locale
constructor or destructor within a std::basic_stringstream constructor or
destructor. Perhaps this is related to reference counting? I am not
certain, but I suspect that the cause does not lie with the particular
std::string or std::stringstream instances that show up repeatedly in my
stack traces, as I have exhaustively checked and re-checked them and they
seem completely kosher. This leads me to believe that a bug elsewhere in my
code (or in an included library?) is indirectly causing these problems, and
this is what has brought me to this list.

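To make the reference-counting suspicion concrete, here is a minimal sketch
of how I understand a shared string representation to work. It is not the
libstdc++ source: SharedBuf and Rep are hypothetical names, and the sketch
assumes C++11 std::atomic (which gcc 3.4 does not provide) as a stand-in for
__gnu_cxx::__exchange_and_add () and __gnu_cxx::__atomic_add ().

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstring>

// Illustration only: a hypothetical SharedBuf, NOT the libstdc++ source.
class SharedBuf {
    // Shared representation: one heap block per distinct string value.
    struct Rep {
        std::atomic<int> refs;  // number of SharedBuf handles sharing this Rep
        std::size_t len;
        char* data;

        explicit Rep(const char* s)
            : refs(1), len(std::strlen(s)), data(new char[len + 1]) {
            std::memcpy(data, s, len + 1);  // copy including the terminator
        }
        ~Rep() { delete[] data; }
    };

    Rep* rep_;

    // Drop one reference; free the Rep when the last handle goes away.
    void release() {
        // fetch_sub returns the value held *before* the subtraction,
        // the same contract as an exchange-and-add of -1.
        int previous = rep_->refs.fetch_sub(1, std::memory_order_acq_rel);
        assert(previous > 0 && "refcount underflow: Rep already freed?");
        if (previous == 1) delete rep_;
    }

public:
    explicit SharedBuf(const char* s) : rep_(new Rep(s)) {}

    // Copying shares the Rep and bumps the count (the atomic-add path).
    SharedBuf(const SharedBuf& other) : rep_(other.rep_) {
        rep_->refs.fetch_add(1, std::memory_order_relaxed);
    }

    SharedBuf& operator=(const SharedBuf& other) {
        // Take the new reference first so self-assignment stays safe.
        other.rep_->refs.fetch_add(1, std::memory_order_relaxed);
        release();
        rep_ = other.rep_;
        return *this;
    }

    ~SharedBuf() { release(); }

    const char* c_str() const { return rep_->data; }
    int use_count() const { return rep_->refs.load(std::memory_order_acquire); }
};
```

If this picture is roughly right, every copy and every destruction touches
the shared count, so a count (or the memory holding it) that has been
damaged elsewhere could show up inside these calls even when the string in
the stack trace is itself innocent. That is only my working assumption.
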
I am wondering, first of all, whether anyone on this list has had a similar
experience and may have suggestions on how to resolve it. I am also hoping
for any advice on what could cause this, what type of problematic code
could lead to it, what steps I could take, or what debugging options are
available in libstdc++ to help me isolate it. Below are examples of typical
compile and link commands showing switches, libs, defines, etc. Any
assistance or direction with this would be immensely appreciated.

Thanks and Regards,
Chad.

g++ -g -c -Wall -DNDEBUG -D_REENTRANT -DSOLARIS -DX86 -DSUNOS \
    -I../../solaris10_x86/snmpinc -I../../solaris10_x86/mysqlinc \
    -I../../BRTools -I../../BRSocket -I../../BRRTP -I../../MP \
    -I../../MPProxy -I../ -D_GLIBCXX_DEBUG -I./ XProxyMain.cpp \
    -o ./XProxyMain.o

g++ XProxyMain.o XProxy.o XClient.o ../../MPProxy/libMPProxy.a \
    ../../MP/libMP.a ../../BRSIP/libBRSIP.a ../../BRRTP/libBRRTP.a \
    ../../BRSocket/libBRSocket.a /../BRTools/libBRTools.a \
    -L../../solaris10_x86/snmplib -lucdagent -lucdmibs -lsnmp \
    ../../solaris10_x86/mysqllib/libmysqlclient_r.a \
    /usr/local/lib/libstdc++.a -mt -lposix4 -lpthread -lresolv -lsocket \
    -lnsl -lm -lz -ldl -lkstat -lkvm -o ./XProxy

Compiler (from sunfreeware.com) version below:

Reading specs from /usr/local/lib/gcc/i386-pc-solaris2.10/3.4.6/specs
Configured with: ../configure --with-as=/usr/ccs/bin/as --with-ld=/usr/ccs/bin/ld --enable-shared --enable-languages=c,c++,f77
Thread model: posix
gcc version 3.4.6

I get the same results when compiling with the gcc 3.4.3 that ships with
Solaris 10, version below:

Reading specs from /usr/sfw/lib/gcc/i386-pc-solaris2.10/3.4.3/specs
Configured with: /builds/sfw10-gate/usr/src/cmd/gcc/gcc-3.4.3/configure --prefix=/usr/sfw --with-as=/usr/sfw/bin/gas --with-gnu-as --with-ld=/usr/ccs/bin/ld --without-gnu-ld --enable-languages=c,c++ --enable-shared
Thread model: posix
gcc version 3.4.3 (csl-sol210-3_4-branch+sol_rpath)