Steffen Blume wrote:
> Hi,
>
> I have tried to set up multi master replication without success. The
> two ldap servers are running fine. Then I execute the mmr.pl script (on b):
> ./mmr.pl --host1 a.domain.local --host2 b.domain.local --bindpw secret
> --host1_id 1 --host2_id 2 --repmanpw secret --base "dc=domain, dc=local"
> --create
>
> --- error log on a ---
> [01/Sep/2010:14:11:39 +0200] NSMMReplicationPlugin -
> agmt="cn="Replication to b.domain.local"" (b:389): Replica has a
> different generation ID than the local data.
> [01/Sep/2010:14:11:42 +0200] NSMMReplicationPlugin - Beginning total
> update of replica "agmt="cn="Replication to b.domain.local"" (b:389)".
> [01/Sep/2010:14:11:47 +0200] NSMMReplicationPlugin - Finished total
> update of replica "agmt="cn="Replication to b.domain.local"" (b:389)".
> Sent 1375 entries.
> --------------------
>
> --- error log on b ---
> [01/Sep/2010:14:11:39 +0200] NSMMReplicationPlugin -
> agmt="cn="Replication to a.domain.local"" (a:389): Replica has a
> different generation ID than the local data.
> [01/Sep/2010:14:11:40 +0200] NSMMReplicationPlugin -
> repl_set_mtn_referrals: could not set referrals for replica
> dc=domain,dc=local: 32
> [01/Sep/2010:14:11:40 +0200] NSMMReplicationPlugin -
> multimaster_be_state_change: replica dc=domain,dc=local is going
> offline; disabling replication
> [01/Sep/2010:14:11:41 +0200] - somehow, there are still 200 entries in
> the entry cache. :/
> [01/Sep/2010:14:11:42 +0200] - WARNING: Import is running with
> nsslapd-db-private-import-mem on; No other process is allowed to access
> the database
> [01/Sep/2010:14:11:46 +0200] - import userRoot: Workers finished;
> cleaning up...
> [01/Sep/2010:14:11:46 +0200] - import userRoot: Workers cleaned up.
> [01/Sep/2010:14:11:46 +0200] - import userRoot: Indexing complete.
> Post-processing...
> [01/Sep/2010:14:11:46 +0200] - import userRoot: Flushing caches...
> [01/Sep/2010:14:11:46 +0200] - import userRoot: Closing files...
> [01/Sep/2010:14:11:46 +0200] - somehow, there are still 200 entries in
> the entry cache. :/
> [01/Sep/2010:14:11:47 +0200] - import userRoot: Import complete.
> Processed 1375 entries in 5 seconds. (275.00 entries/sec)
> [01/Sep/2010:14:11:47 +0200] NSMMReplicationPlugin -
> multimaster_be_state_change: replica dc=domain,dc=local is coming
> online; enabling replication
> [01/Sep/2010:14:11:47 +0200] NSMMReplicationPlugin -
> _replica_configure_ruv: failed to create replica ruv tombstone entry
> (dc=domain, dc=local); LDAP error - 68
>
This means the RUV entry, or some other MMR state information, was left over from a previous configuration attempt. LDAP error 68 is "Already Exists": the server tried to create the RUV tombstone entry, but it is already there. Since this step fails, nothing else is going to work.
> [01/Sep/2010:14:11:47 +0200] NSMMReplicationPlugin -
> replica_enable_replication: reloading ruv failed
> [01/Sep/2010:14:11:49 +0200] NSMMReplicationPlugin -
> _replica_configure_ruv: failed to create replica ruv tombstone entry
> (dc=domain, dc=local); LDAP error - 68
> [01/Sep/2010:14:12:19 +0200] NSMMReplicationPlugin -
> _replica_configure_ruv: failed to create replica ruv tombstone entry
> (dc=domain, dc=local); LDAP error - 68
> [01/Sep/2010:14:12:49 +0200] NSMMReplicationPlugin -
> _replica_configure_ruv: failed to create replica ruv tombstone entry
> (dc=domain, dc=local); LDAP error - 68
> [01/Sep/2010:14:13:19 +0200] NSMMReplicationPlugin -
> _replica_configure_ruv: failed to create replica ruv tombstone entry
> (dc=domain, dc=local); LDAP error - 68
> [01/Sep/2010:14:13:49 +0200] NSMMReplicationPlugin -
> _replica_configure_ruv: failed to create replica ruv tombstone entry
> (dc=domain, dc=local); LDAP error - 68
> --------------------
>
> So what do the errors "repl_set_mtn_referrals: could not set referrals"
> and "_replica_configure_ruv: failed to create replica ruv tombstone
> entry" mean?
>
> The messages on b stop, when I restart the ldap server. But the
> replication is not working.
Since the MMR setup failed, replication is not going to work.
> On the first replication setup not all the
> data was copied. I removed the replication configuration with mmr.pl
I think this is the problem. Either mmr.pl does not cleanly remove the replication configuration, or there is a bug in the server. For example, see https://bugzilla.redhat.com/show_bug.cgi?id=624442
> and
> set it up again with same error messages.
> When I change something (in uid=sbl,ou=people,...) on a the error log of
> a shows
> --- error log on a ---
> [01/Sep/2010:14:35:20 +0200] NSMMReplicationPlugin -
> agmt="cn="Replication to b.domain.local"" (b:389): Replica has a
> different generation ID than the local data.
> [01/Sep/2010:14:35:24 +0200] NSMMReplicationPlugin -
> agmt="cn="Replication to b.domain.local"" (b:389): Replica has a
> different generation ID than the local data.
> [01/Sep/2010:14:35:28 +0200] NSMMReplicationPlugin -
> agmt="cn="Replication to b.domain.local"" (b:389): Replica has a
> different generation ID than the local data.
> ...
> --------------------
>
This means the consumer was not initialized properly.
> Nothing in error log on b.
> But in access log:
>
> --- access log on b ---
> [01/Sep/2010:14:35:20 +0200] conn=0 op=3 SRCH base="ou=People,
> dc=domain, dc=local" scope=1 filter="(objectClass=*)" attrs="objectClass"
> [01/Sep/2010:14:35:20 +0200] conn=0 op=7 EXT
> oid="2.16.840.1.113730.3.5.3" name="Netscape Replication Start Session"
> [01/Sep/2010:14:35:20 +0200] conn=0 op=7 RESULT err=0 tag=120 nentries=0
> etime=0
> [01/Sep/2010:14:35:20 +0200] conn=0 op=8 EXT
> oid="2.16.840.1.113730.3.5.5" name="Netscape Replication End Session"
> [01/Sep/2010:14:35:20 +0200] conn=0 op=8 RESULT err=0 tag=120 nentries=0
> etime=0
> [01/Sep/2010:14:35:20 +0200] conn=0 op=3 RESULT err=0 tag=101
> nentries=100 etime=0 notes=U
> [01/Sep/2010:14:35:20 +0200] conn=0 op=4 SRCH base="ou=People,
> dc=domain, dc=local" scope=1 filter="(objectClass=*)" attrs="objectClass"
> [01/Sep/2010:14:35:20 +0200] conn=0 op=4 RESULT err=0 tag=101
> nentries=82 etime=0
> [01/Sep/2010:14:35:24 +0200] conn=0 op=10 EXT
> oid="2.16.840.1.113730.3.5.3" name="Netscape Replication Start Session"
> [01/Sep/2010:14:35:24 +0200] conn=0 op=10 RESULT err=0 tag=120
> nentries=0 etime=0
> [01/Sep/2010:14:35:24 +0200] conn=0 op=11 EXT
> oid="2.16.840.1.113730.3.5.5" name="Netscape Replication End Session"
> [01/Sep/2010:14:35:24 +0200] conn=0 op=11 RESULT err=0 tag=120
> nentries=0 etime=0
> [01/Sep/2010:14:35:25 +0200] conn=0 op=5 SRCH
> base="uid=sbl,ou=People,dc=domain,dc=local" scope=0
> filter="(objectClass=*)" attrs=ALL
> [01/Sep/2010:14:35:25 +0200] conn=0 op=5 RESULT err=0 tag=101 nentries=1
> etime=0
> [01/Sep/2010:14:35:27 +0200] conn=0 op=12 EXT
> oid="2.16.840.1.113730.3.5.3" name="Netscape Replication Start Session"
> [01/Sep/2010:14:35:27 +0200] conn=0 op=12 RESULT err=0 tag=120
> nentries=0 etime=0
> [01/Sep/2010:14:35:27 +0200] conn=0 op=13 EXT
> oid="2.16.840.1.113730.3.5.5" name="Netscape Replication End Session"
> [01/Sep/2010:14:35:27 +0200] conn=0 op=13 RESULT err=0 tag=120
> nentries=0 etime=0
> ...
> --------------------
>
> Both 389 DS versions are 1.2.4. I compiled it myself for OpenSolaris
> (SunOS 5.11 snv_111b)
>
Try 1.2.6. There have been many, many bug fixes between 1.2.4 and 1.2.6.
> Regards,
> Steffen
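
For anyone reading error logs like the ones quoted above, here is a minimal Python sketch that pulls the numeric code off the end of a replication plugin message and names it. It is not part of 389 DS or mmr.pl: the helper name `explain` and the regular expression are made up for illustration, and only the code names come from the standard LDAP result codes (RFC 4511). It assumes the wrapped log lines have been joined back into one message, and it assumes the trailing 32 on the repl_set_mtn_referrals line is an LDAP result code, which nothing in this thread confirms.

```python
import re

# Standard LDAP result codes (RFC 4511). Only the codes that appear in
# this thread are listed; extend the table as needed.
LDAP_RESULT_NAMES = {
    0: "success",
    32: "noSuchObject",
    68: "entryAlreadyExists",
}

# Hypothetical pattern covering the two message shapes quoted above:
#   "...; LDAP error - 68"
#   "... could not set referrals for replica dc=domain,dc=local: 32"
# The second shape is ASSUMED to carry an LDAP result code.
ERROR_CODE = re.compile(r"(?:LDAP error - |replica \S+: )(\d+)\s*$")


def explain(message):
    """Return (code, name) for a log message ending in a result code,
    or None if the message does not end in one of the known shapes."""
    match = ERROR_CODE.search(message)
    if match is None:
        return None
    code = int(match.group(1))
    return code, LDAP_RESULT_NAMES.get(code, "unknown")


if __name__ == "__main__":
    samples = [
        "_replica_configure_ruv: failed to create replica ruv tombstone "
        "entry (dc=domain, dc=local); LDAP error - 68",
        "repl_set_mtn_referrals: could not set referrals for replica "
        "dc=domain,dc=local: 32",
    ]
    for sample in samples:
        print(explain(sample))
```

If the assumption about the trailing 32 holds, that message would mean the entry the server tried to update with referrals did not exist at that moment, which would fit the backend going offline for the import right afterwards.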