Re: Need help with illegal instruction errors in COPR

On 12/15/18 1:11 PM, Orion Poplawski wrote:
I'm testing out rebuilding packages with openmpi 3.1 in COPR:

https://copr.fedorainfracloud.org/coprs/g/scitech/openmpi3.1/builds/

A number of packages fail while running their tests, and only on Fedora Rawhide x86_64, with processes killed by signal 4 (Illegal instruction).  For example:

+ PYTHONPATH=/builddir/build/BUILDROOT/mpi4py-3.0.0-6.git39ca78422646.fc30.x86_64/usr/lib64/python2.7/site-packages/openmpi
+ mpiexec -n 1 python2 test/runtests.py -v --no-builddir --thread-level=serialized -e spawn
BUILDSTDERR: --------------------------------------------------------------------------
BUILDSTDERR: Primary job  terminated normally, but 1 process returned
BUILDSTDERR: a non-zero exit code. Per user-direction, the job has been aborted.
BUILDSTDERR: --------------------------------------------------------------------------
BUILDSTDERR: --------------------------------------------------------------------------
BUILDSTDERR: mpiexec noticed that process rank 0 with PID 0 on node 656ae442c6bf45fe9b45c5481f41bc45 exited on signal 4 (Illegal instruction).
BUILDSTDERR: --------------------------------------------------------------------------

Unfortunately, I have been unable to reproduce this in any local mock build, so I'm left wondering whether this is a peculiarity of the COPR builders or a real problem with openmpi.  Any suggestions for debugging this further would be greatly appreciated.

(PID 0 seems very odd)

- Orion



With the help of the very useful libSegFault I was able to generate a backtrace in COPR with:

export LD_PRELOAD=/usr/lib64/libSegFault.so
export SEGFAULT_SIGNALS=ill

which pointed to the libpsm2 library. I've filed https://bugzilla.redhat.com/show_bug.cgi?id=1659852
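
For anyone wanting to try the same thing in a spec file's %check section, the two exports above could be wrapped roughly as follows. This is only a sketch, not what was actually used in the COPR build: the function name enable_segfault_backtrace is made up for illustration, and the existence check is there on the assumption that libSegFault.so may not be installed on every builder (as far as I know, newer glibc releases no longer ship it).

```shell
# Hypothetical helper: preload libSegFault so that a fatal signal prints a
# backtrace into the build log. The default path and the default signal
# ("ill") are the values from the exports above.
enable_segfault_backtrace() {
    if [ -r "${1:-/usr/lib64/libSegFault.so}" ]; then
        # Prepend the library, keeping any LD_PRELOAD that is already set.
        export LD_PRELOAD="${1:-/usr/lib64/libSegFault.so}${LD_PRELOAD:+:$LD_PRELOAD}"
        # Select the signal(s) that libSegFault should handle.
        export SEGFAULT_SIGNALS="${2:-ill}"
        return 0
    fi
    echo "libSegFault not found at ${1:-/usr/lib64/libSegFault.so}; no backtrace handler installed" >&2
    return 1
}

# Possible use in %check (the test command is the one from the log above):
#   enable_segfault_backtrace || :
#   mpiexec -n 1 python2 test/runtests.py -v --no-builddir --thread-level=serialized -e spawn
```

The function returns non-zero when the library is missing, so the caller can decide whether that should fail the build or just be ignored as in the commented usage.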

Any Koji/COPR debug tips pages out there that could benefit from mentioning libSegFault/catchsegv?

--
Orion Poplawski
Manager of NWRA Technical Systems          720-772-5637
NWRA, Boulder/CoRA Office             FAX: 303-415-9702
3380 Mitchell Lane                       orion@xxxxxxxx
Boulder, CO 80301                 https://www.nwra.com/



