On 02/05/2010 08:32 AM, Francesco Fiore wrote:
>
>
> Francesco Fiore wrote:
>>
>>
>> Rich Megginson wrote:
>>> Francesco Fiore wrote:
>>>
>>>> Hi,
>>>> I have two directory servers in a multimaster configuration. I have to
>>>> reinitialize all databases on the 2nd server (B) using the data of the 1st (A).
>>>> After the synchronization, server B crashes with a segmentation fault.
>>>> There isn't any relevant message in the error log.
>>>> If I restart directory server B, I get the same error.
>>>> The directory server version is 1.1.3 on Redhat5.
>>>>
>>>>
>>> rpm -qi fedora-ds-base
>>>
>>> 32-bit or 64-bit?
>>>
>>> We have fixed quite a few replication bugs since 1.1.3, including a
>>> couple of crashes. I recommend upgrading to the latest.
>>>
>> # rpm -qi 389-ds-base
>> Name : 389-ds-base Relocations: (not relocatable)
>> Version : 1.2.4 Vendor: Fedora Project
>> Release : 1.el5 Build Date: Tue 03 Nov 2009 04:47:39 PM CET
>> Install Date: Fri 05 Feb 2010 11:49:11 AM CET Build Host: x86-6.fedora.phx.redhat.com
>> Group : System Environment/Daemons Source RPM: 389-ds-base-1.2.4-1.el5.src.rpm
>> Size : 5339258 License: GPLv2 with exceptions
>> Signature : DSA/SHA1, Fri 06 Nov 2009 05:17:38 PM CET, Key ID 119cc036217521f6
>> Packager : Fedora Project
>> URL : http://port389.org/
>> Summary : 389 Directory Server (base)
>> Description :
>>
>> x86-64
>>
>> I updated to the latest stable version but I still get the same error.
>> I traced the running process and found that the segmentation
>> fault is probably caused by the futex system call. I attach the tail of
>> the output of the strace command below.
>>
>> getpeername(6, 0x7fff8256e3a0, [1475252821577171056]) = -1 ENOTCONN (Transport endpoint is not connected)
>> poll([{fd=42, events=POLLIN}, {fd=-1}, {fd=6, events=POLLIN}, {fd=-1}, {fd=65, events=POLLIN}], 5, 250) = 1 ([{fd=65, revents=POLLIN}])
>> futex(0x145f806c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x145f8068, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
>> futex(0x145d0850, FUTEX_WAKE_PRIVATE, 1) = 1
>> getpeername(6, 0x7fff8256e3a0, [1475252821577171056]) = -1 ENOTCONN (Transport endpoint is not connected)
>> poll([{fd=42, events=POLLIN}, {fd=-1}, {fd=6, events=POLLIN}, {fd=-1}], 4, 250) = 1 ([{fd=42, revents=POLLIN}])
>> read(42, "\0", 200) = 1
>> getpeername(6, 0x7fff8256e3a0, [1475252821577171056]) = -1 ENOTCONN (Transport endpoint is not connected)
>> poll([{fd=42, events=POLLIN}, {fd=-1}, {fd=6, events=POLLIN}, {fd=-1}, {fd=64, events=POLLIN}], 5, 250) = 1 ([{fd=64, revents=POLLIN}])
>> futex(0x145f806c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x145f8068, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
>> futex(0x14550730, FUTEX_WAKE_PRIVATE, 1 <unavailable ...>
>> getpeername(6, 0x7fff8256e3a0, [1475252821577171056]) = -1 ENOTCONN (Transport endpoint is not connected)
>> poll([{fd=42, events=POLLIN}, {fd=-1}, {fd=6, events=POLLIN}, {fd=-1}, {fd=65, events=POLLIN}], 5, 250) = 1 ([{fd=65, revents=POLLIN}])
>> futex(0x145f806c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x145f8068, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
>> futex(0x145d0850, FUTEX_WAKE_PRIVATE, 1) = 1
>> getpeername(6, 0x7fff8256e3a0, [1475252821577171056]) = -1 ENOTCONN (Transport endpoint is not connected)
>> poll([{fd=42, events=POLLIN}, {fd=-1}, {fd=6, events=POLLIN}, {fd=-1}], 4, 250) = 1 ([{fd=42, revents=POLLIN}])
>> read(42, "\0", 200) = 1
>> getpeername(6, 0x7fff8256e3a0, [1475252821577171056]) = -1 ENOTCONN (Transport endpoint is not connected)
>> poll([{fd=42, events=POLLIN}, {fd=-1}, {fd=6, events=POLLIN}, {fd=-1}, {fd=64, events=POLLIN}], 5, 250) = 1 ([{fd=64,
revents=POLLIN}])
>> futex(0x145f806c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x145f8068, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
>> futex(0x14550730, FUTEX_WAKE_PRIVATE, 1 <unavailable ...>
>
> I debugged the running process and gdb printed this stacktrace after
> the segmentation fault:
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x63b2b940 (LWP 31976)]
> 0x000000364fa79140 in strcmp () from /lib64/libc.so.6
> (gdb) bt
> #0 0x000000364fa79140 in strcmp () from /lib64/libc.so.6
> #1 0x00002b188041e4fc in ?? () from /usr/lib64/dirsrv/plugins/libback-ldbm.so
> #2 0x00002b188041d8d9 in add_hash () from /usr/lib64/dirsrv/plugins/libback-ldbm.so
> #3 0x00002b188041df27 in ?? () from /usr/lib64/dirsrv/plugins/libback-ldbm.so
> #4 0x00002b188042c273 in id2entry () from /usr/lib64/dirsrv/plugins/libback-ldbm.so
> #5 0x00002b18804594c0 in uniqueid2entry () from /usr/lib64/dirsrv/plugins/libback-ldbm.so
> #6 0x00002b188042b961 in ?? () from /usr/lib64/dirsrv/plugins/libback-ldbm.so
> #7 0x00002b18804445fc in ldbm_back_delete () from /usr/lib64/dirsrv/plugins/libback-ldbm.so
> #8 0x00002b187c4990d4 in ?? () from /usr/lib64/dirsrv/libslapd.so.0
> #9 0x00002b187c499413 in do_delete () from /usr/lib64/dirsrv/libslapd.so.0
> #10 0x0000000000412e79 in sasl_map_config_add ()
> #11 0x0000003590827fad in ?? () from /usr/lib64/libnspr4.so
> #12 0x00000036506064a7 in start_thread () from /lib64/libpthread.so.0
> #13 0x000000364fad3c2d in clone () from /lib64/libc.so.6
>
> I hope this information is useful.

The stacktrace is really useful. Thanks!
If possible, could you install the debuginfo package and take the stacktrace again?

yum install 389-ds-base-debuginfo

--noriko
>>
>>>> I attach the tails of the error log and the /var/log/messages log.
>>>>
>>>> [03/Feb/2010:19:20:53 +0100] - import Addressbook2: Workers finished; cleaning up...
>>>> [03/Feb/2010:19:21:13 +0100] - import Addressbook1: Workers finished; cleaning up...
>>>> [03/Feb/2010:19:21:13 +0100] - import Addressbook2: Workers cleaned up.
>>>> [03/Feb/2010:19:21:13 +0100] - import Addressbook2: Indexing complete. Post-processing...
>>>> [03/Feb/2010:19:21:13 +0100] - import Addressbook1: Workers cleaned up.
>>>> [03/Feb/2010:19:21:13 +0100] - import Addressbook1: Indexing complete. Post-processing...
>>>> [03/Feb/2010:19:21:50 +0100] - import Addressbook2: Flushing caches...
>>>> [03/Feb/2010:19:22:27 +0100] - import Addressbook1: Flushing caches...
>>>> [03/Feb/2010:19:22:27 +0100] - import Addressbook2: Closing files...
>>>> [03/Feb/2010:19:22:27 +0100] - import Addressbook1: Closing files...
>>>> [03/Feb/2010:19:32:27 +0100] - import Addressbook2: Import complete. Processed 3820687 entries in 4957 seconds. (770.77 entries/sec)
>>>> [03/Feb/2010:19:32:28 +0100] NSMMReplicationPlugin - multimaster_be_state_change: replica o=addressbook2 is coming online; enabling replication
>>>> [03/Feb/2010:19:32:29 +0100] - import Addressbook1: Import complete. Processed 3820339 entries in 4960 seconds. (770.23 entries/sec)
>>>> [03/Feb/2010:19:32:29 +0100] NSMMReplicationPlugin - multimaster_be_state_change: replica o=addressbook1 is coming online; enabling replication
>>>> [03/Feb/2010:19:32:29 +0100] NSMMReplicationPlugin - replica_reload_ruv: Warning: new data for replica o=addressbook1 does not match the data in the changelog.
>>>> Recreating the changelog file. This could affect replication with replica's consumers in which case the consumers should be reinitialized.
>>>>
>>>> Feb 3 19:32:35 mmt-l-al19 kernel: ns-slapd[5575]: segfault at 0000000000000000 rip 000000364fa79140 rsp 0000000056bd3b18 error 4
>>>>
>>>> Have you any idea?
>>>>
>>>> Thanks
>>>>
>>>>
>>>>
>>>
>>> --
>>> 389 users mailing list
>>> 389-users at lists.fedoraproject.org
>>> https://admin.fedoraproject.org/mailman/listinfo/389-users
>>>
>>
>> --
>> Francesco Fiore
>> System Integrator
>> Babel S.r.l.
-http://www.babel.it
>> P.zza S.Benedetto da Norcia, 33 - 00040 Pomezia (Roma)
>>
>>
>> CONFIDENTIAL: This message and its attachments are confidential and
>> intended for the addressed recipients. If you have received this
>> message in error, you are kindly asked to reply to the sender
>> immediately and delete all of its contents.
>>
> Thanks
> --
> Francesco Fiore
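
The kernel log line and frame #0 of the gdb backtrace quoted in this thread can be cross-checked against each other. The following is a minimal sketch, not part of the original messages: it assumes the usual x86 page-fault error-code layout (bit 0 = page present, bit 1 = write access, bit 2 = user mode) and that the kernel prints the error code in hexadecimal. The names `parse_segfault`, `SEGFAULT_RE` and `FRAME_RE` are hypothetical helpers introduced only for illustration.

```python
import re

# Both strings are copied from the messages quoted in this thread.
KERNEL_LINE = (
    "Feb 3 19:32:35 mmt-l-al19 kernel: ns-slapd[5575]: segfault at "
    "0000000000000000 rip 000000364fa79140 rsp 0000000056bd3b18 error 4"
)
GDB_FRAME0 = "#0 0x000000364fa79140 in strcmp () from /lib64/libc.so.6"

# Hypothetical patterns for this one log format; other kernels word it differently.
SEGFAULT_RE = re.compile(
    r"(?P<prog>[\w.-]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"rip (?P<rip>[0-9a-f]+) rsp (?P<rsp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)
FRAME_RE = re.compile(r"#0\s+0x(?P<pc>[0-9a-f]+) in (?P<func>\w+) \(\)")


def parse_segfault(line):
    """Split a kernel segfault line into its fields and decode the error bits."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a recognised segfault line")
    err = int(m["err"], 16)  # assumption: error code is printed in hex
    return {
        "prog": m["prog"],
        "pid": int(m["pid"]),
        "addr": int(m["addr"], 16),      # faulting address
        "rip": int(m["rip"], 16),        # instruction pointer at the fault
        "page_present": bool(err & 1),   # assumed bit 0
        "write": bool(err & 2),          # assumed bit 1
        "user_mode": bool(err & 4),      # assumed bit 2
    }


if __name__ == "__main__":
    info = parse_segfault(KERNEL_LINE)
    frame = FRAME_RE.search(GDB_FRAME0)
    same_pc = int(frame["pc"], 16) == info["rip"]
    print("fault address:", hex(info["addr"]))
    print("read access from user mode:", info["user_mode"] and not info["write"])
    print("kernel rip matches gdb frame #0 (%s):" % frame["func"], same_pc)
```

Under those assumptions, the two reports describe the same event: a user-mode read at address 0 whose instruction pointer equals the `strcmp` frame, which is consistent with a NULL pointer reaching `strcmp`. A backtrace taken with the debuginfo package installed would be needed to see which caller passed it.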