Re: replication stopped after server restart - problem to reenable


 



On 2014-02-12 23:25, Rich Megginson wrote:
On 02/12/2014 02:34 PM, Jan Kowalsky wrote:
Hi Rich,

thank you for answering,

Since this is my first experience with replication, I don't know whether I am doing something completely wrong or whether it's a bug. I followed the documentation at https://access.redhat.com/site/documentation/en-US/Red_Hat_Directory_Server/9.0/html/Administration_Guide/Managing_Replication-Configuring-Replication-cmd.html

I've added some more information and logs below.


On 2014-02-12 18:04, Rich Megginson wrote:
On 02/12/2014 06:29 AM, Jan Kowalsky wrote:

Version and platform please - rpm -q 389-ds-base

Yes, sorry, I forgot. I'm running Debian wheezy with the packages supplied by http://obs.kolabsys.com:82/Kolab:/3.1/Debian_7.0/:

389-ds-base 1.2.11.15-1 amd64 389 Directory Server suite - server

I've got a single-master replication with one supplier (ldapmaster1) and one consumer (ldapmaster2)

After server restart replication isn't working anymore. The error-log:

[12/Feb/2014:12:04:51 +0100] NSMMReplicationPlugin - Found replication agreement named "cn=test replica,cn=replica,cn=dc\3Ddatenkollektiv\2Cdc\3Dnet,cn=mapping tree,cn=config".
[12/Feb/2014:12:04:51 +0100] NSMMReplicationPlugin - The replication agreement named "cn=test replica,cn=replica,cn=dc\3Ddatenkollektiv\2Cdc\3Dnet,cn=mapping tree,cn=config" could not be correctly parsed. No replication will occur with this replica.

Looks like there is a problem with your replication agreement.

Yes, the error suggests that. But the agreement was OK at least when replication was initialized, and it worked fine until the restart.

This is how I added replication agreement:

# Adding Replication Agreement on ldmaster0
/usr/lib/mozldap/ldapmodify -h ldmaster0 -p 389 -D "cn=directory manager" -w SECRETBINDPW << EOF
dn: cn=testreplica,cn=replica,cn=dc\=datenkollektiv\,dc\=net,cn=mapping tree,cn=config
changetype: add
objectclass: top
objectclass: nsds5ReplicationAgreement
cn: testreplica
nsds5replicahost: ldslave0
nsds5replicaport: 389
nsds5ReplicaBindDN: cn=replication manager,cn=config
nsds5replicabindmethod: SIMPLE
nsds5replicaroot: dc=test,dc=net
description: agreement between ldmaster0 and ldslave0 for dc=test,dc=net
nsds5replicaupdateschedule: 0001-2359 0123456
nsds5replicatedattributelist: (objectclass=*) $ EXCLUDE authorityRevocationList accountUnlockTime memberof
nsDS5ReplicatedAttributeListTotal: (objectclass=*) $ EXCLUDE accountUnlockTime
nsds5replicacredentials: SECRET
EOF
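
A side note on the DN spelling: the agreement is added with `\=` and `\,` escapes, while the server log prints `\3D` and `\2C`. Both are valid escapes for the same characters under RFC 4514, so the two spellings should name the same entry. The sketch below is only an illustration of that equivalence; the helper name `unescape_dn_value` is made up here, and it assumes ASCII values only (multi-byte UTF-8 hex sequences are not handled).

```python
import re

# Matches a backslash followed by either two hex digits or any single character.
_ESCAPE = re.compile(r"\\([0-9A-Fa-f]{2}|.)")


def unescape_dn_value(value):
    """Undo RFC 4514 style escapes in one attribute value (ASCII only, sketch)."""
    def replace(match):
        token = match.group(1)
        if len(token) == 2:
            # Hex pair such as 3D or 2C
            return chr(int(token, 16))
        # Single escaped character such as \= or \,
        return token
    return _ESCAPE.sub(replace, value)


# Value of the "cn=" RDN as typed in the ldapmodify input
typed_form = r"dc\=datenkollektiv\,dc\=net"
# Value of the same RDN as printed in the error log
logged_form = r"dc\3Ddatenkollektiv\2Cdc\3Dnet"

print(unescape_dn_value(typed_form))
print(unescape_dn_value(logged_form))
```

Both calls print the plain suffix, so the difference in escaping is presumably cosmetic and not by itself the cause of the parse failure.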

And it looks like this before restarting server:

/usr/lib/mozldap/ldapsearch -x -h ldmaster0 -p 389 -D "cn=directory manager" -w SECRET -s sub -b cn=config "(objectclass=nsds5ReplicationAgreement)"
version: 1
dn: cn=testreplica,cn=replica,cn=dc\3Ddatenkollektiv\2Cdc\3Dnet,cn=mapping tre
 e,cn=config
objectClass: top
objectClass: nsds5ReplicationAgreement
cn: testreplica
nsDS5ReplicaHost: ldslave0
nsDS5ReplicaPort: 389
nsDS5ReplicaBindDN: cn=replication manager,cn=config
nsDS5ReplicaBindMethod: SIMPLE
nsDS5ReplicaRoot: dc=test,dc=net
description: agreement between ldmaster0 and ldslave0 for dc=test,dc=net
nsDS5ReplicaUpdateSchedule: 0001-2359 0123456
nsDS5ReplicatedAttributeList: (objectclass=*) $ EXCLUDE authorityRevocationLis
 t accountUnlockTime memberof
nsDS5ReplicatedAttributeListTotal: (objectclass=*) $ EXCLUDE accountUnlockTime
nsDS5ReplicaCredentials: {DES}vxy6ibkSEXE+4XrDjA7d5A==
nsds5replicareapactive: 0
nsds5replicaLastUpdateStart: 20140212210355Z
nsds5replicaLastUpdateEnd: 20140212210359Z
nsds5replicaChangesSentSinceStartup:: NzozLzAg
nsds5replicaLastUpdateStatus: 0 Replica acquired successfully: Incremental upd
 ate succeeded
nsds5replicaUpdateInProgress: FALSE
nsds5replicaLastInitStart: 20140212205308Z
nsds5replicaLastInitEnd: 20140212205311Z
nsds5replicaLastInitStatus: 0 Total update succeeded
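
For readers of the output above: the double colon in `nsds5replicaChangesSentSinceStartup:: NzozLzAg` is the LDIF marker for a base64-encoded value. A minimal sketch for decoding such a line follows; the function name `decode_ldif_value` is invented for this example, and it assumes the decoded bytes are UTF-8 text.

```python
import base64


def decode_ldif_value(line):
    """Split one unfolded LDIF line into (name, value), decoding '::' values."""
    name, _, rest = line.partition(":")
    if rest.startswith(":"):
        # "name:: <base64>" form
        return name, base64.b64decode(rest[1:].strip()).decode("utf-8")
    # "name: <plain value>" form
    return name, rest.strip()


name, value = decode_ldif_value("nsds5replicaChangesSentSinceStartup:: NzozLzAg")
print(name, repr(value))
```

The decoded value is `7:3/0 ` with a trailing space, which is probably why the tool base64-encoded it. Reading it as "replica 7, 3 changes sent, 0 skipped" is an assumption about the attribute's format, not something stated in this thread.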


After restart:

dn: cn=testreplica,cn=replica,cn=dc\3Ddatenkollektiv\2Cdc\3Dnet,cn=mapping tre
 e,cn=config
objectClass: top
objectClass: nsds5ReplicationAgreement
cn: testreplica
nsDS5ReplicaHost: ldslave0
nsDS5ReplicaPort: 389
nsDS5ReplicaBindDN: cn=replication manager,cn=config
nsDS5ReplicaBindMethod: SIMPLE
nsDS5ReplicaRoot: dc=test,dc=net
description: agreement between ldmaster0 and ldslave0 for dc=test,dc=net
nsDS5ReplicaUpdateSchedule: 0001-2359 0123456
nsDS5ReplicatedAttributeList: (objectclass=*) $ EXCLUDE authorityRevocationLis
 t accountUnlockTime memberof
nsDS5ReplicatedAttributeListTotal: (objectclass=*) $ EXCLUDE accountUnlockTime
nsDS5ReplicaCredentials: {DES}vxy6ibkSEXE+4XrDjA7d5A==
nsds50ruv: {replicageneration} 52fbded7000000070000
nsds50ruv: {replica 7 ldap://ldmaster0.datenkollektiv.net:389} 52fbdf290000000
 70000 52fbe1ba000000070000
nsruvReplicaLastModified: {replica 7 ldap://ldmaster0.datenkollektiv.net:389}
 00000000
nsds5replicareapactive: 0
nsds5replicaLastUpdateStart: 0
nsds5replicaLastUpdateEnd: 0
nsds5replicaChangesSentSinceStartup:
nsds5replicaLastUpdateStatus: 0 No replication sessions started since server s
 tartup
nsds5replicaUpdateInProgress: FALSE
nsds5replicaLastInitStart: 0
nsds5replicaLastInitEnd: 0
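
To see at a glance which attributes differ between the two dumps, something like the following sketch may help. It unfolds LDIF continuation lines (lines starting with a single space) and compares attribute names only. The two sample strings are shortened excerpts of the entries above, and the helper names `unfold` and `attribute_names` are made up for this example.

```python
def unfold(ldif_text):
    """Join LDIF continuation lines (those starting with one space) to the previous line."""
    logical = []
    for raw in ldif_text.splitlines():
        if raw.startswith(" ") and logical:
            logical[-1] += raw[1:]
        elif raw.strip():
            logical.append(raw)
    return logical


def attribute_names(ldif_text):
    """Return the lower-cased attribute names of one LDIF entry, excluding dn."""
    names = set()
    for line in unfold(ldif_text):
        name, sep, _ = line.partition(":")
        if sep and name.lower() != "dn":
            names.add(name.lower())
    return names


# Shortened excerpt of the entry before the restart
before = r"""dn: cn=testreplica,cn=replica,cn=dc\3Ddatenkollektiv\2Cdc\3Dnet,cn=mapping tre
 e,cn=config
nsds5replicaLastUpdateStatus: 0 Replica acquired successfully: Incremental upd
 ate succeeded
nsds5replicaLastInitStatus: 0 Total update succeeded
"""

# Shortened excerpt of the entry after the restart
after = r"""dn: cn=testreplica,cn=replica,cn=dc\3Ddatenkollektiv\2Cdc\3Dnet,cn=mapping tre
 e,cn=config
nsds50ruv: {replicageneration} 52fbded7000000070000
nsruvReplicaLastModified: {replica 7 ldap://ldmaster0.datenkollektiv.net:389}
 00000000
nsds5replicaLastUpdateStatus: 0 No replication sessions started since server s
 tartup
"""

only_before = sorted(attribute_names(before) - attribute_names(after))
only_after = sorted(attribute_names(after) - attribute_names(before))
print("only before restart:", only_before)
print("only after restart:", only_after)
```

On these excerpts the RUV attributes show up only after the restart, and the last-init status only before it. Whether that difference is related to the parse error is not established here.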

I put the logs in some gists:


Master log after server restart (replication isn't working anymore)
  https://gist.github.com/jankowa/ddce7791c2f681c54170

Master log before server restart (with working replication)
  https://gist.github.com/jankowa/33dfc8404d5b1f0bf3dc

Slave log before server restart
  https://gist.github.com/jankowa/e21899821aa02a5d8211

All Definitions for replication provided during setup:
  https://gist.github.com/jankowa/4bc116c91c0d2a95e622

I tried to reproduce it again in a fresh environment: still the same error, completely reproducible. The same happened in tests with multi-master replication with two masters.

Any idea appreciated.

The agreements look ok.  Please file a ticket.  Unless someone else

I'll do that.

can spot the problem, the only way we are going to get to the bottom
of this is to add some extra debugging to the code (although there is
a lot of debugging there already for the repl log level - must be
something really strange that it doesn't print a message telling us
what went wrong).

How can I get more debugging output? I have already set the log level to replication.

Not sure what version 1.2.11.15-1 is on Debian.  If it is the same as
the upstream 1.2.11.15, that's very old.  Should see if you can get
them to provide 1.2.11.25 or later.

I tried again with the newest version in Debian unstable, 1.3.0.3-1, but got the same result:

[13/Feb/2014:09:57:49 +0100] NSMMReplicationPlugin - _replica_update_state: failed to update state of csn generator for replica dc=test,dc=net: LDAP error - 32
[13/Feb/2014:09:57:55 +0100] NSMMReplicationPlugin - Total update aborted: Replication agreement for "agmt="cn=testreplica" (ldslave1:389)" can not be updated while the replica is disabled
[13/Feb/2014:09:57:55 +0100] NSMMReplicationPlugin - (If the suffix is disabled you must enable it then restart the server for replication to take place).
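
For reference, LDAP result code 32 is noSuchObject in RFC 4511, so the first message suggests that some entry the plugin tried to update was not found. The sketch below pulls such codes out of error-log lines. The regular expressions are guesses based only on the line layout shown in this thread, and the small name table covers just a few standard codes.

```python
import re

# Layout assumed from the log lines quoted in this thread:
# [timestamp] Source - message
LOG_LINE = re.compile(r"^\[(?P<when>[^\]]+)\] (?P<source>\S+) - (?P<message>.*)$")
LDAP_ERROR = re.compile(r"LDAP error - (\d+)")

# A few standard result codes from RFC 4511 (not a complete table)
RESULT_NAMES = {
    32: "noSuchObject",
    49: "invalidCredentials",
    53: "unwillingToPerform",
}


def ldap_errors(log_text):
    """Return (timestamp, code, name) for every 'LDAP error - N' found in the log."""
    found = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if not match:
            continue
        for code_text in LDAP_ERROR.findall(match.group("message")):
            code = int(code_text)
            found.append((match.group("when"), code, RESULT_NAMES.get(code, "unknown")))
    return found


sample = """[13/Feb/2014:09:57:49 +0100] NSMMReplicationPlugin - _replica_update_state: failed to update state of csn generator for replica dc=test,dc=net: LDAP error - 32
[13/Feb/2014:09:57:55 +0100] NSMMReplicationPlugin - Total update aborted: Replication agreement for "agmt="cn=testreplica" (ldslave1:389)" can not be updated while the replica is disabled"""

errors = ldap_errors(sample)
for when, code, name in errors:
    print(when, code, name)
```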

Jan




