Re: GCC compiles but code crashes. Works w/ Intel compiler

On Sun, Jun 18, 2023 at 8:52 AM Ken Mankoff via Gcc-help <
gcc-help@xxxxxxxxxxx> wrote:

> Hi Jonathan,
>
> On 2023-06-18 at 01:10 -07, Jonathan Wakely <jwakely.gcc@xxxxxxxxx>
> wrote...
> > On Sun, 18 Jun 2023, 01:58 Ken Mankoff via Gcc-help,
> > <gcc-help@xxxxxxxxxxx>
> > wrote:
> >
> >> I'm trying to rebuild everything using GNU/gcc.
> >
> >
> > What does this mean? Most people here are not familiar with Spack, and
> > I have no idea what it means to rebuild using GNU/gcc. Do you just
> > mean using gcc instead of the Intel compiler?
>
> By 'rebuild' I meant 'compile'. Yes - I'm trying to compile using gcc
> instead of Intel. Spack is a package manager. Perhaps not useful
> information.
>
> >> I also now have all the dependencies rebuilt with GNU (lots of
> >> guesswork there). It runs for 1 day. It fails on day 2 when the
> >> coupling between the models is done for the first time.
> >
> > Fails how?
> >
> > It crashes? How? What causes it to crash? What does gdb show?
>
> An array contains a value (1.8e+215) causing an assert to fail. I provided
> gdb output.
>
>
> Perhaps also useful - this same thing occurs on two different machines:
>
> $ lsb_release -a
> Description:    Ubuntu 22.04.2 LTS
>
> $ uname -a
> Linux t480 5.15.0-72-generic #79-Ubuntu SMP Wed Apr 19 08:22:18 UTC 2023
> x86_64 x86_64 x86_64 GNU/Linux
>
> $ gcc --version
> gcc (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0
>
> And
>
> $ lsb_release -a
> Description:    SUSE Linux Enterprise Server 12 SP5
> Release:        12.5
>
> $ uname -a
> Linux discover12 4.12.14-122.156-default #1 SMP Wed Apr 5 06:49:18 UTC
> 2023 (026e398) x86_64 x86_64 x86_64 GNU/Linux
>
> $ gcc --version
> gcc (GCC) 12.1.0
>
>
>   -k.
>

Hmm... what an interesting problem. The Intel compiler works but gcc does
not, is that correct? If so, narrowing down the difference between Intel
and gcc could be key to unraveling the issue. If I understood correctly,
though, you're saying several other changes besides the compiler swap were
made (the dependencies were rebuilt too), so the compiler is not the only
variable. For the sake of argument, assume there is a bug in the source
code and the code generated by gcc is revealing that bug, while something
the Intel compiler does lets it slip by. We can probably set aside concerns
about the GNU/Linux distribution or the specific gcc version, because
Ubuntu with gcc 11.3.0 and SUSE with gcc 12.1.0 both show the same problem.
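
One common kind of latent bug that behaves differently from compiler to
compiler is reading memory before it has been assigned. I don't know that
this is what is happening here; it is only a guess. As a minimal sketch of
how such a read can be made visible, the hypothetical helpers below (the
names alloc_poisoned and count_unset are mine, not from your code) fill a
fresh buffer with NaN so that any element never written stands out:

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not from the model's source: allocate n doubles
 * and fill them with NaN, so an element that is read before it is
 * assigned propagates NaN instead of whatever was left in memory. */
double *alloc_poisoned(size_t n)
{
    double *p = malloc(n * sizeof *p);
    if (p == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        p[i] = NAN;
    return p;
}

/* Count the elements that are still NaN, i.e. were never written.
 * Assumes the computation itself never legitimately produces NaN. */
size_t count_unset(const double *p, size_t n)
{
    assert(p != NULL || n == 0);
    size_t unset = 0;
    for (size_t i = 0; i < n; i++) {
        if (isnan(p[i]))
            unset++;
    }
    return unset;
}
```

If count_unset reports a nonzero value just before the array is first
consumed, some elements were never filled in, and the result would depend
on whatever the allocator or compiler happened to leave there.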

Is there any way to narrow down the scope? Can you remove as much unrelated
source and as many compile steps as possible while still demonstrating the
problem?

Finally, I would caution against letting the ~785th element become too
narrow a focus. Of course, the data at the point of failure is likely
telling you something. But is it possible this is just a side effect, and
the actual error occurred earlier and went unnoticed?
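
One way to test that idea is to check the array at several earlier points
in the run and see where an implausible value first appears. Below is a
rough sketch in C, assuming the data can be viewed as a flat array of
doubles. The function name first_suspect and the limit argument are
hypothetical; pick a threshold that makes physical sense for the field.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Return the index of the first element that is NaN, infinite, or larger
 * in magnitude than `limit`; return -1 if every element looks plausible.
 * `limit` is a caller-chosen sanity bound, not a value from the model. */
ptrdiff_t first_suspect(const double *data, size_t n, double limit)
{
    assert(data != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!isfinite(data[i]) || fabs(data[i]) > limit)
            return (ptrdiff_t)i;
    }
    return -1;
}
```

Calling something like this after each major step (for example before and
after the coupling exchange) should show which step first produces a value
such as 1.8e+215, rather than only where the assert finally trips.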

Wish you all the best with this. Cheers, -Randy



