Re: GCC compiles but code crashes. Works w/ Intel compiler

On Sun, 18 Jun 2023, 01:58 Ken Mankoff via Gcc-help, <gcc-help@xxxxxxxxxxx>
wrote:

> Hi,
>
> I apologize in advance for a complex and poorly defined question/bug. I'd
> love to provide an MWE but cannot. I'm working with a very complex, large,
> and historical codebase. We've coupled a FORTRAN NASA global climate model
> https://simplex.giss.nasa.gov/snapshots/ with a C++ ice sheet model
> https://www.pism.io/ .
>
> Everything runs when this (and ~20 dependencies) are compiled with Intel
> and Spack on our supercomputer.
>
> I'm trying to rebuild everything using GNU/gcc.


What does this mean? Most people here are not familiar with Spack, and I
have no idea what it means to rebuild using GNU/gcc.

Do you just mean using gcc instead of the Intel compiler?


> Each tool runs stand-alone with GCC on both the supercomputer and my
> laptop.
>
> I also now have all the dependencies rebuilt with GNU (lots of guesswork
> there). It runs for 1 day. It fails on day 2 when the coupling between the
> models is done for the first time.


Fails how?

It crashes? How? What causes it to crash? What does gdb show?

There is nothing here that anybody can help with, as all you've said is
that you have a program that crashes.


> I've traced the error to the ~785th element of a 10k element array that
> blows up. I get the same error on both my laptop and on our supercomputer.
>
> When I add a PRINT statement to the Intel/Spack install, I see:
>
> i: 780 deltah(i): 0.83826
> i: 781 deltah(i): 0.849428
> i: 782 deltah(i): 0.856929
> i: 783 deltah(i): 0
> i: 784 deltah(i): 0.910464
> i: 785 deltah(i): 0.747764
> i: 786 deltah(i): 0.774704
> i: 787 deltah(i): 0.858931
> i: 788 deltah(i): 0.823518
> i: 789 deltah(i): 0.939335
> i: 790 deltah(i): 0
>
> And when I go to the same place in gdb on the GNU version after it
> crashes, I see the following (note the first 4 numbers shown here are
> almost identical, as are indices 0 to 780 - presumably just
> hardware/compiler differences).
>
> frame 11 # move to `icebin/slib/icebin/contracts/modele_pism.cpp:111`
> p deltah.data_[780]@10
> # $22 = {0.83825857976891971, 0.84942598585903251, 0.85692695613342984,
> 0, 0.41908462753569942, 3.3390526779554853e-313, 3.2211851062927428e-311,
> 4.2653676902411122e-311, 1.8555380963204083e+251, 0 <repeats 11 times>}
>
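
A rough sketch of how the two dumps above could be compared mechanically, instead of by eye. Both helper names (first_suspicious_index, first_divergence) and the thresholds are made up for illustration; nothing here comes from icebin or Blitz. It only assumes the values are available as plain arrays of double.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// Hypothetical helper: index of the first element that is NaN, infinite,
// subnormal, or larger in magnitude than max_abs. Returns -1 if none is found.
std::ptrdiff_t first_suspicious_index(const double* data, std::size_t n,
                                      double max_abs) {
    for (std::size_t i = 0; i < n; ++i) {
        const int cls = std::fpclassify(data[i]);
        if (cls == FP_NAN || cls == FP_INFINITE || cls == FP_SUBNORMAL ||
            std::fabs(data[i]) > max_abs) {
            return static_cast<std::ptrdiff_t>(i);
        }
    }
    return -1;
}

// Hypothetical helper: index of the first element where a and b differ by
// more than rel_tol relative to the larger magnitude. Returns -1 if the
// arrays agree everywhere. A NaN on either side counts as a difference.
std::ptrdiff_t first_divergence(const double* a, const double* b,
                                std::size_t n, double rel_tol) {
    for (std::size_t i = 0; i < n; ++i) {
        const double scale = std::max(std::fabs(a[i]), std::fabs(b[i]));
        const double diff = std::fabs(a[i] - b[i]);
        if (!(diff <= rel_tol * scale)) {
            return static_cast<std::ptrdiff_t>(i);
        }
    }
    return -1;
}
```

Fed the numbers quoted above (starting at i = 780), first_divergence with a relative tolerance of 1e-4 reports offset 4, i.e. i = 784 (0.910464 with Intel, 0.419085 with GCC), one element before the first subnormal value at offset 5 (i = 785) that first_suspicious_index reports. Values like 3.3e-313 are subnormal doubles, which are rarely the result of real arithmetic on data of order 1.
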
>
> There are of course many changes other than just Intel/GNU: When changing
> from Spack to Spackless, I had to figure out/guess how to deal with ~30
> dependencies. About 10 moved into a Conda/mamba environment, and I
> hand-built the rest. I've guessed at many CMake and configure commands.
> There are lots of places where I could have introduced a problem, and I
> introduced plenty. But I've solved enough that it compiles, runs, and the
> data looks good for the first ~785 elements of this array.
>
> That makes me think, though I'm not sure why, that maybe it's a
> compiler issue. If anyone has any suggestions for how to debug this
> further, I'd be happy to hear them.
>
> Thank you,
>
>   Ken Mankoff
>
>
> FYI, here is how I build one of the many dependencies, the 'icebin' tool
> above where the crash occurs (although I believe `deltah`, a Blitz array,
> is generated elsewhere in Fortran, but the crash occurs here in C++).
>
>
> CC="${LIME_ROOT}/opt/bin/mpicc" \
>   CXX="${LIME_ROOT}/opt/bin/mpicxx" \
>   FC="${LIME_ROOT}/opt/bin/mpif90" \
>   PETSC_DIR="${LIME_ROOT}/src/petsc-3.7.7" \
>   PETSC_ARCH="arch-linux2-c-debug" \
>   cmake .. \
>   -D CMAKE_INSTALL_PREFIX=${LIME_ROOT}/opt \
>   -D CMAKE_C_FLAGS="-DNDEBUG -O0 -ggdb3 -fpermissive -fPIC -I${MAMBA_ENV}/meli/lib/python3.11/site-packages/numpy/core/include" \
>   -D CMAKE_CXX_FLAGS="-DNDEBUG -O0 -ggdb3 -fpermissive -fPIC -I${MAMBA_ENV}/meli/lib/python3.11/site-packages/numpy/core/include" \
>   -D CMAKE_PREFIX_PATH="${LIME_ROOT}/opt/include/boost:${MAMBA_ENV}/meli" \
>   -D CMAKE_IGNORE_PATH="/usr;/lib;/usr/include;/usr/lib;/usr/lib64;/usr/bin" \
>   -D BUILD_COUPLER=YES \
>   -D BUILD_MODELE=YES \
>   -D BUILD_GRIDGEN=YES \
>   -D BUILD_PYTHON=YES \
>   -D USE_PISM=YES \
>   -D Boost_INCLUDE_DIR=${LIME_ROOT}/opt/include \
>   -D Boost_INCLUDE_DIRS=${LIME_ROOT}/opt/include \
>   -D Boost_LIBRARY_DIRS=${LIME_ROOT}/opt/lib \
>   -D BLITZ_ROOT=${LIME_ROOT}/opt \
>   -D BLITZ_LIBRARY=${LIME_ROOT}/opt/lib/libblitz.so \
>   -D CGAL_LIBRARY=${LIME_ROOT}/opt/lib/libCGAL.so \
>   -D CGAL_INCLUDE_DIR=${LIME_ROOT}/opt/include \
>   -D CYTHON_EXECUTABLE=${MAMBA_ENV}/meli/bin/cython \
>   -D EIGEN3_INCLUDE_DIR=${MAMBA_ENV}/meli/include/eigen3 \
>   -D EVERYTRACE_c_REFADDR=${LIME_ROOT}/opt/lib \
>   -D EVERYTRACE_INCLUDE_DIR=${LIME_ROOT}/opt/include \
>   -D EVERYTRACE_LIBRARY=${LIME_ROOT}/opt/lib/libeverytrace.so \
>   -D GMP_INCLUDE_DIR=${MAMBA_ENV}/meli/include \
>   -D GMP_LIBRARY=${MAMBA_ENV}/meli/lib/libgmp.so \
>   -D GTEST_LIBRARY_MAIN=${MAMBA_ENV}/meli/lib/libgtest.so \
>   -D GTEST_INCLUDE_DIR=${MAMBA_ENV}/meli/include \
>   -D IBMISC_ROOT=${LIME_ROOT}/opt \
>   -D IBMISC_INCLUDE_DIR=${LIME_ROOT}/opt/include \
>   -D IBMISC_LIBRARY=${LIME_ROOT}/opt/lib/libibmisc.so \
>   -D MPFR_INCLUDES=${MAMBA_ENV}/meli/include \
>   -D MPFR_LIBRARIES=${MAMBA_ENV}/meli/lib/libmpfr.so \
>   -D MPIEXEC_EXECUTABLE=${LIME_ROOT}/opt/bin/mpiexec \
>   -D MPI_C_COMPILER=${LIME_ROOT}/opt/bin/mpicc \
>   -D MPI_CXX_COMPILER=${LIME_ROOT}/opt/bin/mpicxx \
>   -D MPI_Fortran_COMPILER=${LIME_ROOT}/opt/bin/mpif90 \
>   -D NETCDF_CXX4_LIBRARY=${LIME_ROOT}/opt/lib/libnetcdf-cxx4.so \
>   -D NETCDF_CXX4_INCLUDE_DIR=${LIME_ROOT}/opt/include \
>   -D PROJ4_INCLUDES=${MAMBA_ENV}/meli/include \
>   -D PROJ4_LIBRARIES=${MAMBA_ENV}/meli/lib/libproj.so \
>   -D PYTHON_EXECUTABLE=${MAMBA_ENV}/meli/bin/python \
>   -D PYTHON_LIBRARY=${MAMBA_ENV}/meli/lib/libpython3.so \
>   -D PYTHON_INCLUDES=${MAMBA_ENV}/meli/include/python3.11 \
>   -D TCLAP_INCLUDE_DIR=${MAMBA_ENV}/meli/include \
>   -D ZLIB_INCLUDE_DIR=${MAMBA_ENV}/meli/include \
>   -D ZLIB_LIBRARY=${MAMBA_ENV}/meli/lib/libz.so \
>   -Wno-dev
>
> make -j
> make install
>
>


