Re: Can't locate CSN in Multi-Master replica


 



Dael Maselli wrote:
As I said this is a test for a big central LDAP server and before starting from
scratch I would like to know what's gone wrong.
There are odd problems from time to time when deleting/recreating replication agreements, replica config, and changelog config. That's why it's better to start from scratch.

I enabled the replica logs and this is the result. Note that ds-m1 is node A and ds-m4 is node B; the others, ds-m2 and ds-m3, were in the 4-way test. How can I delete them from the configuration?

--- Node A ---
[05/Nov/2007:11:53:32 +0100] NSMMReplicationPlugin - : Update window will close at Tue Nov 6 00:01:00 2007
[05/Nov/2007:11:53:32 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): State: wait_for_changes -> wait_for_changes
[05/Nov/2007:11:53:32 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): State: wait_for_changes -> start
[05/Nov/2007:11:53:32 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): No linger to cancel on the connection
[05/Nov/2007:11:53:32 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): Disconnected from the consumer
[05/Nov/2007:11:53:32 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): State: start -> ready_to_acquire_replica
[05/Nov/2007:11:53:32 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): Trying secure slapi_ldap_init
[05/Nov/2007:11:53:32 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): binddn = , passwd =
[05/Nov/2007:11:53:32 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): No linger to cancel on the connection
[05/Nov/2007:11:53:32 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): Replica was successfully acquired.
[05/Nov/2007:11:53:32 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): State: ready_to_acquire_replica -> sending_updates
[05/Nov/2007:11:53:32 +0100] - _cl5PositionCursorForReplay (agmt="cn=ds-m4.infn.it" (ds-m4:636)): Consumer RUV:
[05/Nov/2007:11:53:32 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): {replicageneration} 471e1779000000010000
[05/Nov/2007:11:53:32 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): {replica 4 ldap://ds-m4.infn.it:389} 471f8bb5000000040000 4721e4e7000000040000 00000000
[05/Nov/2007:11:53:32 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): {replica 1 ldap://ds-m1.infn.it:389} 471e185e000000010000 47220f21000000010000 00000000
[05/Nov/2007:11:53:32 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): {replica 2 ldap://ds-m2.infn.it:389} 471e1834000000020000 47220a40000000020000 00000000
[05/Nov/2007:11:53:32 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): {replica 3 ldap://ds-m3.infn.it:389} 4721e230000000030000 4721e5c6000000030000 00000000
[05/Nov/2007:11:53:32 +0100] - _cl5PositionCursorForReplay (agmt="cn=ds-m4.infn.it" (ds-m4:636)): Supplier RUV:
[05/Nov/2007:11:53:32 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): {replicageneration} 471e1779000000010000
[05/Nov/2007:11:53:32 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): {replica 1 ldap://ds-m1.infn.it:389} 471e185e000000010000 4725e80f000000010000 4725e80f
[05/Nov/2007:11:53:32 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): {replica 2 ldap://ds-m2.infn.it:389} 471e1834000000020000 47220a40000000020000 00000000
[05/Nov/2007:11:53:32 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): {replica 3 ldap://ds-m3.infn.it:389} 4721e230000000030000 4721e5c6000000030000 00000000
[05/Nov/2007:11:53:32 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): {replica 4 ldap://ds-m4.infn.it:389} 471f8bb5000000040000 4721e4e7000000040000 00000000
[05/Nov/2007:11:53:32 +0100] agmt="cn=ds-m4.infn.it" (ds-m4:636) - session start: anchorcsn=47220f21000000010000
[05/Nov/2007:11:53:32 +0100] agmt="cn=ds-m4.infn.it" (ds-m4:636) - Can't locate CSN 47220f21000000010000 in the changelog (DB rc=-30990). The consumer may need to be reinitialized.
[05/Nov/2007:11:53:32 +0100] agmt="cn=ds-m4.infn.it" (ds-m4:636) - clcache_load_buffer: rc=-30990
[05/Nov/2007:11:53:32 +0100] NSMMReplicationPlugin - changelog program - agmt="cn=ds-m4.infn.it" (ds-m4:636): CSN 47220f21000000010000 found, position set for replay
[05/Nov/2007:11:53:32 +0100] agmt="cn=ds-m4.infn.it" (ds-m4:636) - clcache_load_buffer: rc=-30990
[05/Nov/2007:11:53:32 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): No more updates to send (cl5GetNextOperationToReplay)
[05/Nov/2007:11:53:32 +0100] - repl5_inc_waitfor_async_results: 0 0
[05/Nov/2007:11:53:32 +0100] - repl5_inc_result_threadmain starting
[05/Nov/2007:11:53:33 +0100] - repl5_inc_result_threadmain exiting
[05/Nov/2007:11:53:33 +0100] agmt="cn=ds-m4.infn.it" (ds-m4:636) - session end: state=0 load=0 sent=0 skipped=0
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): Successfully released consumer
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): Beginning linger on the connection
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): State: sending_updates -> wait_for_changes
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): State: wait_for_changes -> start
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): Cancelling linger on the connection
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): Disconnected from the consumer
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): State: start -> ready_to_acquire_replica
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): Trying secure slapi_ldap_init
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): binddn = , passwd =
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): No linger to cancel on the connection
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): Replica was successfully acquired.
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): State: ready_to_acquire_replica -> sending_updates
[05/Nov/2007:11:53:33 +0100] - _cl5PositionCursorForReplay (agmt="cn=ds-m4.infn.it" (ds-m4:636)): Consumer RUV:
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): {replicageneration} 471e1779000000010000
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): {replica 4 ldap://ds-m4.infn.it:389} 471f8bb5000000040000 4721e4e7000000040000 00000000
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): {replica 1 ldap://ds-m1.infn.it:389} 471e185e000000010000 47220f21000000010000 00000000
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): {replica 2 ldap://ds-m2.infn.it:389} 471e1834000000020000 47220a40000000020000 00000000
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): {replica 3 ldap://ds-m3.infn.it:389} 4721e230000000030000 4721e5c6000000030000 00000000
[05/Nov/2007:11:53:33 +0100] - _cl5PositionCursorForReplay (agmt="cn=ds-m4.infn.it" (ds-m4:636)): Supplier RUV:
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): {replicageneration} 471e1779000000010000
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): {replica 1 ldap://ds-m1.infn.it:389} 471e185e000000010000 4725e80f000000010000 4725e80f
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): {replica 2 ldap://ds-m2.infn.it:389} 471e1834000000020000 47220a40000000020000 00000000
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): {replica 3 ldap://ds-m3.infn.it:389} 4721e230000000030000 4721e5c6000000030000 00000000
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): {replica 4 ldap://ds-m4.infn.it:389} 471f8bb5000000040000 4721e4e7000000040000 00000000
[05/Nov/2007:11:53:33 +0100] agmt="cn=ds-m4.infn.it" (ds-m4:636) - session start: anchorcsn=47220f21000000010000
[05/Nov/2007:11:53:33 +0100] agmt="cn=ds-m4.infn.it" (ds-m4:636) - Can't locate CSN 47220f21000000010000 in the changelog (DB rc=-30990). The consumer may need to be reinitialized.
[05/Nov/2007:11:53:33 +0100] agmt="cn=ds-m4.infn.it" (ds-m4:636) - clcache_load_buffer: rc=-30990
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - changelog program - agmt="cn=ds-m4.infn.it" (ds-m4:636): CSN 47220f21000000010000 found, position set for replay
[05/Nov/2007:11:53:33 +0100] agmt="cn=ds-m4.infn.it" (ds-m4:636) - clcache_load_buffer: rc=-30990
[05/Nov/2007:11:53:33 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): No more updates to send (cl5GetNextOperationToReplay)
[05/Nov/2007:11:53:33 +0100] - repl5_inc_waitfor_async_results: 0 0
[05/Nov/2007:11:53:33 +0100] - repl5_inc_result_threadmain starting
[05/Nov/2007:11:53:34 +0100] - repl5_inc_result_threadmain exiting
[05/Nov/2007:11:53:34 +0100] agmt="cn=ds-m4.infn.it" (ds-m4:636) - session end: state=0 load=0 sent=0 skipped=0
[05/Nov/2007:11:53:34 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): Successfully released consumer
[05/Nov/2007:11:53:34 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): Beginning linger on the connection
[05/Nov/2007:11:53:34 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): State: sending_updates -> wait_for_changes
[05/Nov/2007:11:53:32 +0100] conn=0 op=106 SRCH base="cn=replication,cn=config" scope=2 filter="(objectClass=*)" attrs=ALL
[05/Nov/2007:11:53:32 +0100] conn=0 op=106 RESULT err=0 tag=101 nentries=1 etime=0
[05/Nov/2007:11:53:32 +0100] conn=0 op=107 MOD dn="cn=ds-m4.infn.it, cn=replica, cn=\22dc=infn,dc=it\22, cn=mapping tree, cn=config"
[05/Nov/2007:11:53:32 +0100] conn=0 op=107 RESULT err=0 tag=103 nentries=0 etime=0
[05/Nov/2007:11:53:32 +0100] conn=0 op=108 MOD dn="cn=ds-m4.infn.it, cn=replica, cn=\22dc=infn,dc=it\22, cn=mapping tree, cn=config"
[05/Nov/2007:11:53:32 +0100] conn=0 op=108 RESULT err=0 tag=103 nentries=0 etime=0
[05/Nov/2007:11:54:35 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): Linger timeout has expired on the connection
[05/Nov/2007:11:54:35 +0100] NSMMReplicationPlugin - agmt="cn=ds-m4.infn.it" (ds-m4:636): Disconnected from the consumer
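
For anyone reading the RUV lines in the Node A log, here is a minimal sketch that splits a CSN into its fields. It assumes the usual 20-hex-digit layout (8 digits of Unix timestamp, 4 of sequence number, 4 of replica ID, 4 of sub-sequence number); that layout is not stated in this thread, although the replica IDs in the log do line up with it. The names `CSN` and `parse_csn` are made up for this illustration.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class CSN(NamedTuple):
    """Decoded change sequence number (assumed field layout, see parse_csn)."""
    timestamp: datetime
    seqnum: int
    replica_id: int
    subseqnum: int


def parse_csn(csn: str) -> CSN:
    """Split a 20-hex-digit CSN string into its four assumed fields."""
    if len(csn) != 20:
        raise ValueError(f"expected 20 hex digits, got {len(csn)}: {csn!r}")
    seconds = int(csn[0:8], 16)       # assumed: seconds since the Unix epoch
    return CSN(
        timestamp=datetime.fromtimestamp(seconds, tz=timezone.utc),
        seqnum=int(csn[8:12], 16),
        replica_id=int(csn[12:16], 16),
        subseqnum=int(csn[16:20], 16),
    )


# Max CSNs for replica 1, copied from the Consumer RUV and Supplier RUV log lines.
consumer_max = parse_csn("47220f21000000010000")
supplier_max = parse_csn("4725e80f000000010000")

if __name__ == "__main__":
    print("consumer has seen replica", consumer_max.replica_id,
          "up to", consumer_max.timestamp.isoformat())
    print("supplier has generated up to", supplier_max.timestamp.isoformat())
    print("consumer is behind:", consumer_max < supplier_max)
```

Read this way, the anchor CSN in the "Can't locate CSN" message is the newest change from replica 1 that the consumer reports having, and the supplier's own maximum for replica 1 is later.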

--- Node B ---
[05/Nov/2007:11:55:14 +0100] NSMMReplicationPlugin - conn=1967 op=3 repl="dc=infn,dc=it": Begin incremental protocol
[05/Nov/2007:11:55:14 +0100] NSMMReplicationPlugin - conn=1967 op=3 repl="dc=infn,dc=it": Acquired replica
[05/Nov/2007:11:55:14 +0100] NSMMReplicationPlugin - conn=1967 op=3 repl="dc=infn,dc=it": StartNSDS50ReplicationRequest: response=0 rc=0
[05/Nov/2007:11:55:15 +0100] NSMMReplicationPlugin - conn=1967 op=4 repl="dc=infn,dc=it": Released replica
[05/Nov/2007:11:55:16 +0100] NSMMReplicationPlugin - conn=1968 op=3 repl="dc=infn,dc=it": Begin incremental protocol
[05/Nov/2007:11:55:16 +0100] NSMMReplicationPlugin - conn=1968 op=3 repl="dc=infn,dc=it": Acquired replica
[05/Nov/2007:11:55:16 +0100] NSMMReplicationPlugin - conn=1968 op=3 repl="dc=infn,dc=it": StartNSDS50ReplicationRequest: response=0 rc=0
[05/Nov/2007:11:55:17 +0100] NSMMReplicationPlugin - conn=1968 op=4 repl="dc=infn,dc=it": Released replica
[05/Nov/2007:11:55:14 +0100] conn=1967 fd=65 slot=65 SSL connection from 193.206.153.171 to 193.206.144.35
[05/Nov/2007:11:55:14 +0100] conn=1967 SSL 256-bit AES; client CN=ds-m1.infn.it,L=Lecce,OU=Host,O=INFN,C=IT; issuer CN=INFN CA,O=INFN,C=IT
[05/Nov/2007:11:55:14 +0100] conn=1967 SSL client bound as cn=ds-m1.infn.it,cn=config
[05/Nov/2007:11:55:14 +0100] conn=1967 op=0 BIND dn="" method=sasl version=3 mech=EXTERNAL
[05/Nov/2007:11:55:14 +0100] conn=1967 op=0 RESULT err=0 tag=97 nentries=0 etime=0 dn="cn=ds-m1.infn.it,cn=config"
[05/Nov/2007:11:55:14 +0100] conn=1967 op=1 SRCH base="" scope=0 filter="(objectClass=*)" attrs="supportedControl supportedExtension"
[05/Nov/2007:11:55:14 +0100] conn=1967 op=1 RESULT err=0 tag=101 nentries=1 etime=0
[05/Nov/2007:11:55:14 +0100] conn=1967 op=2 SRCH base="" scope=0 filter="(objectClass=*)" attrs="supportedControl supportedExtension"
[05/Nov/2007:11:55:14 +0100] conn=1967 op=2 RESULT err=0 tag=101 nentries=1 etime=0
[05/Nov/2007:11:55:14 +0100] conn=1967 op=3 EXT oid="2.16.840.1.113730.3.5.3" name="Netscape Replication Start Session"
[05/Nov/2007:11:55:14 +0100] conn=1967 op=3 RESULT err=0 tag=120 nentries=0 etime=0
[05/Nov/2007:11:55:15 +0100] conn=1967 op=4 EXT oid="2.16.840.1.113730.3.5.5" name="Netscape Replication End Session"
[05/Nov/2007:11:55:15 +0100] conn=1967 op=4 RESULT err=0 tag=120 nentries=0 etime=0
[05/Nov/2007:11:55:15 +0100] conn=1967 op=5 UNBIND
[05/Nov/2007:11:55:15 +0100] conn=1967 op=5 fd=65 closed - U1
[05/Nov/2007:11:55:15 +0100] conn=1968 fd=66 slot=66 SSL connection from 193.206.153.171 to 193.206.144.35
[05/Nov/2007:11:55:15 +0100] conn=1968 SSL 256-bit AES; client CN=ds-m1.infn.it,L=Lecce,OU=Host,O=INFN,C=IT; issuer CN=INFN CA,O=INFN,C=IT
[05/Nov/2007:11:55:15 +0100] conn=1968 SSL client bound as cn=ds-m1.infn.it,cn=config
[05/Nov/2007:11:55:15 +0100] conn=1968 op=0 BIND dn="" method=sasl version=3 mech=EXTERNAL
[05/Nov/2007:11:55:15 +0100] conn=1968 op=0 RESULT err=0 tag=97 nentries=0 etime=0 dn="cn=ds-m1.infn.it,cn=config"
[05/Nov/2007:11:55:15 +0100] conn=1968 op=1 SRCH base="" scope=0 filter="(objectClass=*)" attrs="supportedControl supportedExtension"
[05/Nov/2007:11:55:15 +0100] conn=1968 op=1 RESULT err=0 tag=101 nentries=1 etime=0
[05/Nov/2007:11:55:15 +0100] conn=1968 op=2 SRCH base="" scope=0 filter="(objectClass=*)" attrs="supportedControl supportedExtension"
[05/Nov/2007:11:55:15 +0100] conn=1968 op=2 RESULT err=0 tag=101 nentries=1 etime=0
[05/Nov/2007:11:55:16 +0100] conn=1968 op=3 EXT oid="2.16.840.1.113730.3.5.3" name="Netscape Replication Start Session"
[05/Nov/2007:11:55:16 +0100] conn=1968 op=3 RESULT err=0 tag=120 nentries=0 etime=0
[05/Nov/2007:11:55:17 +0100] conn=1968 op=4 EXT oid="2.16.840.1.113730.3.5.5" name="Netscape Replication End Session"
[05/Nov/2007:11:55:17 +0100] conn=1968 op=4 RESULT err=0 tag=120 nentries=0 etime=0
[05/Nov/2007:11:56:17 +0100] conn=1968 op=5 UNBIND
[05/Nov/2007:11:56:17 +0100] conn=1968 op=5 fd=66 closed - U1

Thank you.

Richard Megginson wrote:
Dael Maselli wrote:

Richard Megginson, on 31/10/2007 17.43, wrote:
Dael Maselli wrote:
[...]
"SSL Client Authentication". Here I had a problem! A pop-up told me it couldn't connect to the other FDS server, but I thought it was a bug, because I checked with tcpdump and saw no packets sent (I can see them with simple auth). So I clicked to continue and everything seemed to work well, even the initialization from A to B. I did not initialize when I created the agreement from B to A in the same way.
You don't need to initialize from B to A if you already did the initialize from A to B.

Yes, I never did it. I only did A->B.


When you did the tcpdump, did you look at traffic on port 389 too, or just 636?

I looked at 389 when I used simple auth over an unencrypted connection, and I saw packets. With SSL auth I specify port 636 as the destination of the agreement, so I didn't look at 389. On 636 there were no packets.

I also tried SSL on port 389, hoping for TLS, but it didn't work.
I suggest turning up the error log level to the replication log, then attempt to initialize B from A. You may have to enable replication logging on both A and B - see http://directory.fedoraproject.org/wiki/FAQ#Troubleshooting
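
As a rough illustration of "turning up the error log level to the replication log", the sketch below builds the LDIF for such a change. It assumes the replication debugging level is 8192 and that it is set through `nsslapd-errorlog-level` on `cn=config`; neither detail is given in this thread, so check both against the FAQ linked above for your version. The helper name `errorlog_level_ldif` is made up for this example.

```python
# Assumed value for replication debugging; verify against your server's docs.
REPLICATION_LOG_LEVEL = 8192


def errorlog_level_ldif(level: int = REPLICATION_LOG_LEVEL) -> str:
    """Return an LDIF modify record that sets the error log level on cn=config."""
    if level < 0:
        raise ValueError("log level must not be negative")
    lines = [
        "dn: cn=config",
        "changetype: modify",
        "replace: nsslapd-errorlog-level",
        f"nsslapd-errorlog-level: {level}",
        "",  # a blank line ends the LDIF record
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Print the record so it can be fed to ldapmodify on each server.
    print(errorlog_level_ldif(), end="")
```

The output would be applied with ldapmodify, bound as an administrative user, on both A and B. Applying the same record with the previous level restores normal logging once the test is done.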

By the way, in the production environment I need 4-way MMR. The manual says to set it up with agreements from A to B and D, from B to A and C, and so on, in a circular manner. I don't like this layout because of its split-brain danger and its lack of tolerance for more than one server fault, so I first tried connecting every server to every other. Is that wrong?
No.
Could it be the cause of the CSN disaster?
I don't think so.
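
The ring versus all-to-all comparison can be checked with a small sketch. It models each pair of agreements (X to Y and Y to X) as one undirected link and asks whether the surviving masters can still reach each other after a given number of servers go down. The node names and helper functions are made up for this illustration, and it only models reachability, not how the server actually replays changes.

```python
from itertools import combinations

MASTERS = ["A", "B", "C", "D"]  # hypothetical names for the four masters


def ring(nodes):
    """Links for the circular layout: each master talks to its two neighbours."""
    n = len(nodes)
    return {frozenset((nodes[i], nodes[(i + 1) % n])) for i in range(n)}


def full_mesh(nodes):
    """Links for the all-to-all layout."""
    return {frozenset(pair) for pair in combinations(nodes, 2)}


def is_connected(alive, links):
    """True if every surviving master can reach every other via surviving links."""
    alive = set(alive)
    if not alive:
        return True
    seen, todo = set(), [next(iter(alive))]
    while todo:
        node = todo.pop()
        if node in seen:
            continue
        seen.add(node)
        for link in links:
            if node in link:
                (other,) = link - {node}
                if other in alive and other not in seen:
                    todo.append(other)
    return seen == alive


def tolerates(nodes, links, failures):
    """True if the survivors stay connected for every set of `failures` outages."""
    return all(
        is_connected(set(nodes) - set(down), links)
        for down in combinations(nodes, failures)
    )


if __name__ == "__main__":
    for name, links in (("ring", ring(MASTERS)), ("mesh", full_mesh(MASTERS))):
        print(name, "agreements:", 2 * len(links),
              "| survives 1 failure:", tolerates(MASTERS, links, 1),
              "| survives 2 failures:", tolerates(MASTERS, links, 2))
```

Under this model the ring needs 8 agreements and the full mesh 12. The ring stops being connected when two non-adjacent masters are down, which is the split-brain case raised above; the mesh stays connected.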

Note that after this 4-way test I deleted all agreements, replicas and changelogs. Maybe some "dirty" configuration is left over?
Ah, yes, that could be.  Can you start over again from scratch?

Thanks.


------------------------------------------------------------------------

--
Fedora-directory-users mailing list
Fedora-directory-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/fedora-directory-users

