Re: Help with flexiblas crash on aarch64 in koji only

On 1/4/25 11:33, Orion Poplawski wrote:
Since the latest update to OpenBLAS 0.3.28 in rawhide, FlexiBLAS fails to build on aarch64 because OpenBLAS crashes in the LAPACK-xeigtstc_cec_in test. Note that the OpenBLAS build itself does not fail only because it doesn't include the LAPACK test suite.

See:
- The first failure in Koschei after the 0.3.28 update: https://koschei.fedoraproject.org/package/flexiblas
- The build log: https://koji.fedoraproject.org/koji/taskinfo?taskID=125998498

FTBFS report here: https://bugzilla.redhat.com/show_bug.cgi?id=2329491

I have attempted to collect some more debug info via the following - https://src.fedoraproject.org/fork/orion/rpms/flexiblas/tree/debug

But the valgrind run just seems to hang with no output from valgrind - https://kojipkgs.fedoraproject.org//work/tasks/3875/127513875/build.log

  Tests of the Nonsymmetric eigenproblem condition estimation routines
  CTRSYL, CTREXC, CTRSNA, CTRSEN
  Relative machine precision (EPS) =     0.119209E-06
  Safe minimum (SFMIN)             =     0.117549E-37
  Routines pass computational tests if test ratio is less than   20.00
  CEC routines passed the tests of the error exits ( 41 tests done)

And the crash seems to occur only after memory corruption has already happened, so the crash site itself seems to be of limited utility.  So I'm at a loss myself.

So, it looks like I just wasn't waiting long enough for the test to progress; valgrind must be adding a huge overhead. I'm now seeing:

==44481== Thread 10:
==44481== Invalid read of size 4
==44481==    at 0x6182DC4: cgemm_beta_NEOVERSEN1 (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E29A03: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E9A0FB: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E9A4D7: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x70AEF1F: ??? (in /usr/lib64/libgomp.so.1.0.0)
==44481==    by 0x4F1D187: start_thread (in /usr/lib64/libc.so.6)
==44481==    by 0x4F8729B: thread_start (in /usr/lib64/libc.so.6)
==44481== Address 0x53cd9e0 is 0 bytes after a block of size 111,504 alloc'd
==44481==    at 0x48854F0: malloc (vg_replace_malloc.c:446)
==44481==    by 0x10C6CB: csyl01_ (csyl01.f:151)
==44481==    by 0x1104B3: cchkec_.constprop.0 (cchkec.f:129)
==44481==    by 0x119D4F: MAIN__ (cchkee.F:1271)
==44481==    by 0x10C327: main (cchkee.F:2553)
==44481==
==44481== Invalid read of size 4
==44481==    at 0x6182DCC: cgemm_beta_NEOVERSEN1 (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E29A03: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E9A0FB: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E9A4D7: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x70AEF1F: ??? (in /usr/lib64/libgomp.so.1.0.0)
==44481==    by 0x4F1D187: start_thread (in /usr/lib64/libc.so.6)
==44481==    by 0x4F8729B: thread_start (in /usr/lib64/libc.so.6)
==44481== Address 0x53cd9e8 is 8 bytes after a block of size 111,504 alloc'd
==44481==    at 0x48854F0: malloc (vg_replace_malloc.c:446)
==44481==    by 0x10C6CB: csyl01_ (csyl01.f:151)
==44481==    by 0x1104B3: cchkec_.constprop.0 (cchkec.f:129)
==44481==    by 0x119D4F: MAIN__ (cchkee.F:1271)
==44481==    by 0x10C327: main (cchkee.F:2553)
==44481==
==44481== Invalid write of size 4
==44481==    at 0x6182DF4: cgemm_beta_NEOVERSEN1 (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E29A03: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E9A0FB: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E9A4D7: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x70AEF1F: ??? (in /usr/lib64/libgomp.so.1.0.0)
==44481==    by 0x4F1D187: start_thread (in /usr/lib64/libc.so.6)
==44481==    by 0x4F8729B: thread_start (in /usr/lib64/libc.so.6)
==44481== Address 0x53cd9e0 is 0 bytes after a block of size 111,504 alloc'd
==44481==    at 0x48854F0: malloc (vg_replace_malloc.c:446)
==44481==    by 0x10C6CB: csyl01_ (csyl01.f:151)
==44481==    by 0x1104B3: cchkec_.constprop.0 (cchkec.f:129)
==44481==    by 0x119D4F: MAIN__ (cchkee.F:1271)
==44481==    by 0x10C327: main (cchkee.F:2553)
==44481==
==44481== Invalid write of size 4
==44481==    at 0x6182DF8: cgemm_beta_NEOVERSEN1 (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E29A03: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E9A0FB: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E9A4D7: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x70AEF1F: ??? (in /usr/lib64/libgomp.so.1.0.0)
==44481==    by 0x4F1D187: start_thread (in /usr/lib64/libc.so.6)
==44481==    by 0x4F8729B: thread_start (in /usr/lib64/libc.so.6)
==44481== Address 0x53cd9e8 is 8 bytes after a block of size 111,504 alloc'd
==44481==    at 0x48854F0: malloc (vg_replace_malloc.c:446)
==44481==    by 0x10C6CB: csyl01_ (csyl01.f:151)
==44481==    by 0x1104B3: cchkec_.constprop.0 (cchkec.f:129)
==44481==    by 0x119D4F: MAIN__ (cchkee.F:1271)
==44481==    by 0x10C327: main (cchkee.F:2553)
==44481==
==44481== Invalid read of size 4
==44481==    at 0x6182DC4: cgemm_beta_NEOVERSEN1 (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E36BC7: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E9A0FB: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E9A4D7: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x70AEF1F: ??? (in /usr/lib64/libgomp.so.1.0.0)
==44481==    by 0x4F1D187: start_thread (in /usr/lib64/libc.so.6)
==44481==    by 0x4F8729B: thread_start (in /usr/lib64/libc.so.6)
==44481== Address 0x53cd9e0 is 0 bytes after a block of size 111,504 alloc'd
==44481==    at 0x48854F0: malloc (vg_replace_malloc.c:446)
==44481==    by 0x10C6CB: csyl01_ (csyl01.f:151)
==44481==    by 0x1104B3: cchkec_.constprop.0 (cchkec.f:129)
==44481==    by 0x119D4F: MAIN__ (cchkee.F:1271)
==44481==    by 0x10C327: main (cchkee.F:2553)
==44481==
==44481== Invalid read of size 4
==44481==    at 0x6182DCC: cgemm_beta_NEOVERSEN1 (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E36BC7: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E9A0FB: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E9A4D7: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x70AEF1F: ??? (in /usr/lib64/libgomp.so.1.0.0)
==44481==    by 0x4F1D187: start_thread (in /usr/lib64/libc.so.6)
==44481==    by 0x4F8729B: thread_start (in /usr/lib64/libc.so.6)
==44481== Address 0x53cd9e8 is 8 bytes after a block of size 111,504 alloc'd
==44481==    at 0x48854F0: malloc (vg_replace_malloc.c:446)
==44481==    by 0x10C6CB: csyl01_ (csyl01.f:151)
==44481==    by 0x1104B3: cchkec_.constprop.0 (cchkec.f:129)
==44481==    by 0x119D4F: MAIN__ (cchkee.F:1271)
==44481==    by 0x10C327: main (cchkee.F:2553)
==44481==
==44481== Invalid write of size 4
==44481==    at 0x6182DF4: cgemm_beta_NEOVERSEN1 (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E36BC7: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E9A0FB: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E9A4D7: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x70AEF1F: ??? (in /usr/lib64/libgomp.so.1.0.0)
==44481==    by 0x4F1D187: start_thread (in /usr/lib64/libc.so.6)
==44481==    by 0x4F8729B: thread_start (in /usr/lib64/libc.so.6)
==44481== Address 0x53cd9e0 is 0 bytes after a block of size 111,504 alloc'd
==44481==    at 0x48854F0: malloc (vg_replace_malloc.c:446)
==44481==    by 0x10C6CB: csyl01_ (csyl01.f:151)
==44481==    by 0x1104B3: cchkec_.constprop.0 (cchkec.f:129)
==44481==    by 0x119D4F: MAIN__ (cchkee.F:1271)
==44481==    by 0x10C327: main (cchkee.F:2553)
==44481==
==44481== Invalid write of size 4
==44481==    at 0x6182DF8: cgemm_beta_NEOVERSEN1 (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E36BC7: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E9A0FB: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E9A4D7: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x70AEF1F: ??? (in /usr/lib64/libgomp.so.1.0.0)
==44481==    by 0x4F1D187: start_thread (in /usr/lib64/libc.so.6)
==44481==    by 0x4F8729B: thread_start (in /usr/lib64/libc.so.6)
==44481== Address 0x53cd9e8 is 8 bytes after a block of size 111,504 alloc'd
==44481==    at 0x48854F0: malloc (vg_replace_malloc.c:446)
==44481==    by 0x10C6CB: csyl01_ (csyl01.f:151)
==44481==    by 0x1104B3: cchkec_.constprop.0 (cchkec.f:129)
==44481==    by 0x119D4F: MAIN__ (cchkee.F:1271)
==44481==    by 0x10C327: main (cchkee.F:2553)
==44481==
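
Since the reports above repeat the same few patterns, here is a rough sketch (not something I ran in koji) of collapsing Memcheck output into distinct signatures. The helper name summarize and the abbreviated SAMPLE text are my own for illustration, and the regexes assume the plain-text Memcheck format shown above. On the eight reports above it should reduce them to four signatures: reads and writes of 4 bytes in cgemm_beta_NEOVERSEN1 at 0 and 8 bytes past the end of the 111,504-byte block allocated in csyl01_.

```python
import re
from collections import Counter

# First line of a Memcheck error, e.g. "==44481== Invalid read of size 4".
ERROR_RE = re.compile(r"^==\d+==\s+Invalid (read|write) of size (\d+)")
# Innermost frame, e.g. "==44481==    at 0x6182DC4: cgemm_beta_NEOVERSEN1 (in ...)".
FRAME_RE = re.compile(r"^==\d+==\s+at 0x[0-9A-Fa-f]+: (\S+)")
# Address description for an access past the end of a heap block.
ADDR_RE = re.compile(
    r"^==\d+==\s+Address 0x[0-9A-Fa-f]+ is (\d+) bytes after a block of size ([\d,]+)"
)


def summarize(log_text):
    """Count (kind, size, function, bytes_past_end, block_size) signatures.

    Hypothetical helper: only handles "N bytes after a block" reports.
    """
    counts = Counter()
    kind = size = func = None
    for line in log_text.splitlines():
        m = ERROR_RE.match(line)
        if m:
            # Start of a new report; forget the previous frame.
            kind, size, func = m.group(1), int(m.group(2)), None
            continue
        m = FRAME_RE.match(line)
        if m and kind and func is None:
            # Only the first "at" line after the error is the faulting frame.
            func = m.group(1)
            continue
        m = ADDR_RE.match(line)
        if m and kind:
            offset = int(m.group(1))
            block = int(m.group(2).replace(",", ""))
            counts[(kind, size, func, offset, block)] += 1
            kind = None  # ignore the allocation stack that follows
    return counts


# Abbreviated excerpt of the log above, used only to exercise the parser.
SAMPLE = """\
==44481== Invalid read of size 4
==44481==    at 0x6182DC4: cgemm_beta_NEOVERSEN1 (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==    by 0x5E29A03: ??? (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==  Address 0x53cd9e0 is 0 bytes after a block of size 111,504 alloc'd
==44481==    at 0x48854F0: malloc (vg_replace_malloc.c:446)
==44481==    by 0x10C6CB: csyl01_ (csyl01.f:151)
==44481==
==44481== Invalid write of size 4
==44481==    at 0x6182DF8: cgemm_beta_NEOVERSEN1 (in /usr/lib64/libopenblaso-r0.3.28.so)
==44481==  Address 0x53cd9e8 is 8 bytes after a block of size 111,504 alloc'd
==44481==    at 0x48854F0: malloc (vg_replace_malloc.c:446)
"""

if __name__ == "__main__":
    for (kind, size, func, offset, block), n in sorted(summarize(SAMPLE).items()):
        print(f"{n}x {kind} of {size} bytes in {func}, "
              f"{offset} bytes past a {block}-byte block")
```

Feeding it the saved build.log text instead of SAMPLE would give the same kind of summary for the whole run, assuming the log keeps the "==PID==" prefixes intact.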

But I'm back to not having access to openblas debuginfo in koji.

Maybe I can reproduce the test failure somehow as part of the openblas build.



--
Orion Poplawski
he/him/his  - surely the least important thing about me
IT Systems Manager                         720-772-5637
NWRA, Boulder/CoRA Office             FAX: 303-415-9702
3380 Mitchell Lane                       orion@xxxxxxxx
Boulder, CO 80301                 https://www.nwra.com/

-- 
_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx