Re: test-mesh-crypto segfaults

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Brian,

One thing I forgot to mention explicitly: this is the recently released bluez-5.52 tarball, not git master(!).
Also note that openSUSE Factory use LTO by default now.

from config.log:

It was created by bluez configure 5.52, which was
generated by GNU Autoconf 2.69.  Invocation command line was
  $ ./configure --host=x86_64-suse-linux-gnu --build=x86_64-suse-linux-gnu --program-prefix=
--disable-dependency-tracking --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc
--datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/lib --localstatedir=/var
--sharedstatedir=/var/lib --mandir=/usr/share/man --infodir=/usr/share/info --disable-dependency-tracking
--disable-silent-rules --enable-pie --enable-library --enable-tools --enable-cups --enable-mesh --enable-midi
--enable-test --enable-experimental --enable-deprecated --enable-datafiles --enable-sixaxis
--with-systemdsystemunitdir=/usr/lib/systemd/system --with-systemduserunitdir=/usr/lib/systemd/user

[...]

ac_cv_env_CFLAGS_value='-O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables
-fasynchronous-unwind-tables -fstack-clash-protection -Werror=return-type -flto=auto -g'


Am 04.11.19 um 23:11 schrieb Gix, Brian:
> Hi Stefan,
> 
> On Mon, 2019-11-04 at 22:16 +0100, Stefan Seyfried wrote:
>> Hi,
>>
>> test-mesh-crypto segfaults for me.
>>
>> abuild@strolchi:~/rpmbuild/BUILD/bluez-5.52> gdb unit/test-mesh-crypto
>>
>> [....]
>>
>> [8.6.2 Service Data using Node Identity]
>> ID Resolving Key     = 84396c435ac48560b5965385253e210c
>>                        84396c435ac48560b5965385253e210c => PASS
>> Hash Input           = 00000000000034ae608fbbc1f2c61201
>>                        00000000000034ae608fbbc1f2c61201 => PASS
>> Hash                 = 00861765aefcc57b
>>                        00861765aefcc57b => PASS
>> Mesh ID Beacon       = 0100861765aefcc57b34ae608fbbc1f2c6
>>                        0100861765aefcc57b34ae608fbbc1f2c6 => PASS
>>
> 
> This is very strange.  The unit test looks like it has completed successfully (at least from what you copy-
> pasted).  That test 8.6.2 is the final test, and it looks happy.  Can you verify for me that all the other
> tests completed successfully?

yes, all previous tests passed successfully.

> There is a *small* chance that you could be running on an old kernel:  Kernels version 4.8 and before had a 
> bug in the AEAD crypto code that made mesh code (including this unit test) inoperable...  But this should show
> up *first* in one of the earlier tests within this unit test... so look for any FAIL designations.

This is 5.4.0rc5 right now.

> The other thing about this particular test is that it is the *only* bluez unit test which is based on ELL
> (Embedded Linux Library) instead of GLIB.
> 
> Are you running from an installed ELL when compiling bluez, or do you have ELL in a "peer directory"... For
> instance:
> 
> .../work/ell
> .../work/bluez 

no, I am using the embedded ell library AFAICT. I had "ell-devel" installed in the build root, but it is not used
apparently. Nothing changes if I do not install ell-devel. ell-devel and libell0 are old versions (0.23), but again,
removing them from the buildroot does not change things.

> If you have the ELL source code available, please try running:
> ell % make check
> 
> paying particular attention to the output of:
> unit/test-cipher
> 
> If unit/test-cipher in ELL passes, then unit/test-mesh-crypto in BLUEZ should pass.

I suppose it should work with the ell embedded in bluez-5.52 ;-)

>> Program received signal SIGSEGV, Segmentation fault.
>> 0x00007ffff7e874ae in mem2chunk_check () from /lib64/libc.so.6
>> (gdb) bt
>> #0  0x00007ffff7e874ae in mem2chunk_check () from /lib64/libc.so.6
>> #1  0x00007ffff7e8b6af in free_check () from /lib64/libc.so.6
>> #2  0x0000555555557c98 in l_free (ptr=<optimized out>) at ell/util.c:136
>> #3  l_queue_clear (queue=0x55555556d010, destroy=0x555555557c60
>> <l_free>) at ell/queue.c:107

I was intrigued by the lots of 0x55 bytes in these pointers(?) and checked if MALLOC_PERTURB_ and MALLOC_CHECK_ values
make a difference.

openSUSE Factory by default builds with

abuild@strolchi:~/rpmbuild/BUILD/bluez-5.52> echo $MALLOC_CHECK_
3
abuild@strolchi:~/rpmbuild/BUILD/bluez-5.52> echo $MALLOC_PERTURB_
69

unsetting both leads to an abort() instead of a segfault, but I think that's just a distraction:

~> unit/test-mesh-crypto
[...]
[8.6.2 Service Data using Node Identity]
ID Resolving Key     = 84396c435ac48560b5965385253e210c
                       84396c435ac48560b5965385253e210c => PASS
Hash Input           = 00000000000034ae608fbbc1f2c61201
                       00000000000034ae608fbbc1f2c61201 => PASS
Hash                 = 00861765aefcc57b
                       00861765aefcc57b => PASS
Mesh ID Beacon       = 0100861765aefcc57b34ae608fbbc1f2c6
                       0100861765aefcc57b34ae608fbbc1f2c6 => PASS
[8.6.2 Service Data using Node Identity]
ID Resolving Key     = 84396c435ac48560b5965385253e210c
                       84396c435ac48560b5965385253e210c => PASS
Hash Input           = 00000000000034ae608fbbc1f2c61201
                       00000000000034ae608fbbc1f2c61201 => PASS
Hash                 = 00861765aefcc57b
                       00861765aefcc57b => PASS
Mesh ID Beacon       = 0100861765aefcc57b34ae608fbbc1f2c6
                       0100861765aefcc57b34ae608fbbc1f2c6 => PASS

free(): invalid size
Aborted (core dumped)

in gdb:
Program received signal SIGABRT, Aborted.
0x00007ffff7e3d0d1 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff7e3d0d1 in raise () from /lib64/libc.so.6
#1  0x00007ffff7e2655d in abort () from /lib64/libc.so.6
#2  0x00007ffff7e8045f in __libc_message () from /lib64/libc.so.6
#3  0x00007ffff7e878fc in malloc_printerr () from /lib64/libc.so.6
#4  0x00007ffff7e892ec in _int_free () from /lib64/libc.so.6
#5  0x0000555555557ca4 in l_free (ptr=<optimized out>) at ell/util.c:136
#6  l_queue_clear (queue=0x55555556d2a0, destroy=0x555555557c60 <l_free>) at ell/queue.c:111
#7  0x0000555555557210 in _sub_D_65535_1.7021 () at ell/cipher.c:319
#8  0x00007ffff7fe2c13 in _dl_fini () from /lib64/ld-linux-x86-64.so.2
#9  0x00007ffff7e3f877 in __run_exit_handlers () from /lib64/libc.so.6
#10 0x00007ffff7e3fa2c in exit () from /lib64/libc.so.6
#11 0x00007ffff7e27e12 in __libc_start_main () from /lib64/libc.so.6
#12 0x0000555555557b8a in _start () at ../sysdeps/x86_64/start.S:120

free(): invalid size
Aborted (core dumped)


Valgrind also complains, so I guess something is wrong in libell:

==28744== Invalid read of size 8
==28744==    at 0x10BC7E: l_queue_clear.constprop.0 (queue.c:101)
==28744==    by 0x10B20F: _sub_D_65535_1.7021 (in /home/abuild/rpmbuild/BUILD/bluez-5.52/unit/test-mesh-crypto)
==28744==    by 0x400FC12: _dl_fini (in /lib64/ld-2.30.so)
==28744==    by 0x4889876: __run_exit_handlers (in /lib64/libc-2.30.so)
==28744==    by 0x4889A2B: exit (in /lib64/libc-2.30.so)
==28744==    by 0x4871E11: (below main) (in /lib64/libc-2.30.so)
==28744==  Address 0x4a15040 is 0 bytes inside a block of size 24 free'd
==28744==    at 0x48389AB: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28744==    by 0x10B207: _sub_D_65535_1.7021 (in /home/abuild/rpmbuild/BUILD/bluez-5.52/unit/test-mesh-crypto)
==28744==    by 0x400FC12: _dl_fini (in /lib64/ld-2.30.so)
==28744==    by 0x4889876: __run_exit_handlers (in /lib64/libc-2.30.so)
==28744==    by 0x4889A2B: exit (in /lib64/libc-2.30.so)
==28744==    by 0x4871E11: (below main) (in /lib64/libc-2.30.so)
==28744==  Block was alloc'd at
==28744==    at 0x483777F: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28744==    by 0x10C80D: l_malloc (util.c:62)
==28744==    by 0x10B275: UnknownInlinedFun (queue.c:63)
==28744==    by 0x10B275: l_debug_add_section.constprop.0 (log.c:376)
==28744==    by 0x10B308: _sub_I_65535_0.7019 (queue.c:179)
==28744==    by 0x1119E4: __libc_csu_init (elf-init.c:88)
==28744==    by 0x4871D99: (below main) (in /lib64/libc-2.30.so)
==28744==
==28744== Invalid write of size 8
==28744==    at 0x10BCA9: l_queue_clear.constprop.0 (queue.c:114)
==28744==    by 0x10B20F: _sub_D_65535_1.7021 (in /home/abuild/rpmbuild/BUILD/bluez-5.52/unit/test-mesh-crypto)
==28744==    by 0x400FC12: _dl_fini (in /lib64/ld-2.30.so)
==28744==    by 0x4889876: __run_exit_handlers (in /lib64/libc-2.30.so)
==28744==    by 0x4889A2B: exit (in /lib64/libc-2.30.so)
==28744==    by 0x4871E11: (below main) (in /lib64/libc-2.30.so)
==28744==  Address 0x4a15040 is 0 bytes inside a block of size 24 free'd
==28744==    at 0x48389AB: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28744==    by 0x10B207: _sub_D_65535_1.7021 (in /home/abuild/rpmbuild/BUILD/bluez-5.52/unit/test-mesh-crypto)
==28744==    by 0x400FC12: _dl_fini (in /lib64/ld-2.30.so)
==28744==    by 0x4889876: __run_exit_handlers (in /lib64/libc-2.30.so)
==28744==    by 0x4889A2B: exit (in /lib64/libc-2.30.so)
==28744==    by 0x4871E11: (below main) (in /lib64/libc-2.30.so)
==28744==  Block was alloc'd at
==28744==    at 0x483777F: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28744==    by 0x10C80D: l_malloc (util.c:62)
==28744==    by 0x10B275: UnknownInlinedFun (queue.c:63)
==28744==    by 0x10B275: l_debug_add_section.constprop.0 (log.c:376)
==28744==    by 0x10B308: _sub_I_65535_0.7019 (queue.c:179)
==28744==    by 0x1119E4: __libc_csu_init (elf-init.c:88)
==28744==    by 0x4871D99: (below main) (in /lib64/libc-2.30.so)
==28744==
==28744== Invalid write of size 8
==28744==    at 0x10BCB1: l_queue_clear.constprop.0 (queue.c:115)
==28744==    by 0x10B20F: _sub_D_65535_1.7021 (in /home/abuild/rpmbuild/BUILD/bluez-5.52/unit/test-mesh-crypto)
==28744==    by 0x400FC12: _dl_fini (in /lib64/ld-2.30.so)
==28744==    by 0x4889876: __run_exit_handlers (in /lib64/libc-2.30.so)
==28744==    by 0x4889A2B: exit (in /lib64/libc-2.30.so)
==28744==    by 0x4871E11: (below main) (in /lib64/libc-2.30.so)
==28744==  Address 0x4a15048 is 8 bytes inside a block of size 24 free'd
==28744==    at 0x48389AB: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28744==    by 0x10B207: _sub_D_65535_1.7021 (in /home/abuild/rpmbuild/BUILD/bluez-5.52/unit/test-mesh-crypto)
==28744==    by 0x400FC12: _dl_fini (in /lib64/ld-2.30.so)
==28744==    by 0x4889876: __run_exit_handlers (in /lib64/libc-2.30.so)
==28744==    by 0x4889A2B: exit (in /lib64/libc-2.30.so)
==28744==    by 0x4871E11: (below main) (in /lib64/libc-2.30.so)
==28744==  Block was alloc'd at
==28744==    at 0x483777F: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28744==    by 0x10C80D: l_malloc (util.c:62)
==28744==    by 0x10B275: UnknownInlinedFun (queue.c:63)
==28744==    by 0x10B275: l_debug_add_section.constprop.0 (log.c:376)
==28744==    by 0x10B308: _sub_I_65535_0.7019 (queue.c:179)
==28744==    by 0x1119E4: __libc_csu_init (elf-init.c:88)
==28744==    by 0x4871D99: (below main) (in /lib64/libc-2.30.so)
==28744==
==28744== Invalid write of size 4
==28744==    at 0x10BCB9: l_queue_clear.constprop.0 (queue.c:116)
==28744==    by 0x10B20F: _sub_D_65535_1.7021 (in /home/abuild/rpmbuild/BUILD/bluez-5.52/unit/test-mesh-crypto)
==28744==    by 0x400FC12: _dl_fini (in /lib64/ld-2.30.so)
==28744==    by 0x4889876: __run_exit_handlers (in /lib64/libc-2.30.so)
==28744==    by 0x4889A2B: exit (in /lib64/libc-2.30.so)
==28744==    by 0x4871E11: (below main) (in /lib64/libc-2.30.so)
==28744==  Address 0x4a15050 is 16 bytes inside a block of size 24 free'd
==28744==    at 0x48389AB: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28744==    by 0x10B207: _sub_D_65535_1.7021 (in /home/abuild/rpmbuild/BUILD/bluez-5.52/unit/test-mesh-crypto)
==28744==    by 0x400FC12: _dl_fini (in /lib64/ld-2.30.so)
==28744==    by 0x4889876: __run_exit_handlers (in /lib64/libc-2.30.so)
==28744==    by 0x4889A2B: exit (in /lib64/libc-2.30.so)
==28744==    by 0x4871E11: (below main) (in /lib64/libc-2.30.so)
==28744==  Block was alloc'd at
==28744==    at 0x483777F: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28744==    by 0x10C80D: l_malloc (util.c:62)
==28744==    by 0x10B275: UnknownInlinedFun (queue.c:63)
==28744==    by 0x10B275: l_debug_add_section.constprop.0 (log.c:376)
==28744==    by 0x10B308: _sub_I_65535_0.7019 (queue.c:179)
==28744==    by 0x1119E4: __libc_csu_init (elf-init.c:88)
==28744==    by 0x4871D99: (below main) (in /lib64/libc-2.30.so)
==28744==
==28744== Invalid free() / delete / delete[] / realloc()
==28744==    at 0x48389AB: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28744==    by 0x400FC12: _dl_fini (in /lib64/ld-2.30.so)
==28744==    by 0x4889876: __run_exit_handlers (in /lib64/libc-2.30.so)
==28744==    by 0x4889A2B: exit (in /lib64/libc-2.30.so)
==28744==    by 0x4871E11: (below main) (in /lib64/libc-2.30.so)
==28744==  Address 0x4a15040 is 0 bytes inside a block of size 24 free'd
==28744==    at 0x48389AB: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28744==    by 0x10B207: _sub_D_65535_1.7021 (in /home/abuild/rpmbuild/BUILD/bluez-5.52/unit/test-mesh-crypto)
==28744==    by 0x400FC12: _dl_fini (in /lib64/ld-2.30.so)
==28744==    by 0x4889876: __run_exit_handlers (in /lib64/libc-2.30.so)
==28744==    by 0x4889A2B: exit (in /lib64/libc-2.30.so)
==28744==    by 0x4871E11: (below main) (in /lib64/libc-2.30.so)
==28744==  Block was alloc'd at
==28744==    at 0x483777F: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28744==    by 0x10C80D: l_malloc (util.c:62)
==28744==    by 0x10B275: UnknownInlinedFun (queue.c:63)
==28744==    by 0x10B275: l_debug_add_section.constprop.0 (log.c:376)
==28744==    by 0x10B308: _sub_I_65535_0.7019 (queue.c:179)
==28744==    by 0x1119E4: __libc_csu_init (elf-init.c:88)
==28744==    by 0x4871D99: (below main) (in /lib64/libc-2.30.so)
==28744==
==28744==
==28744== HEAP SUMMARY:
==28744==     in use at exit: 0 bytes in 0 blocks
==28744==   total heap usage: 773 allocs, 774 frees, 18,598 bytes allocated
==28744==
==28744== All heap blocks were freed -- no leaks are possible
==28744==
==28744== For lists of detected and suppressed errors, rerun with: -s
==28744== ERROR SUMMARY: 5 errors from 5 contexts (suppressed: 0 from 0)


Somehow I'm unable to get proper debuginfo and better output from valgrind and gdb, so in a last desparate attempt I
disabled LTO for my build and now the test does not fail.

I'm going to file a bug for the openSUSE toolchain team for the LTO failure, but it is well possible that they will come
back and point to issues in ell or bluez, so stay tuned ;-)

the difference  LTO vs non-LTO is:

LTO:
CFLAGS='-O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables
-fstack-clash-protection -Werror=return-type  -g'

Non-LTO:
CFLAGS='-O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables
-fstack-clash-protection -Werror=return-type -flto=auto -g'

Best regards,
-- 
Stefan Seyfried

"For a successful technology, reality must take precedence over
 public relations, for nature cannot be fooled." -- Richard Feynman



[Index of Archives]     [Bluez Devel]     [Linux Wireless Networking]     [Linux Wireless Personal Area Networking]     [Linux ATH6KL]     [Linux USB Devel]     [Linux Media Drivers]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux