Re: nsds5replicaLastInitStatus: -2 Total update abortedSystem error

Rich Megginson <rmeggins@xxxxxxxxxx> · Mon, 25 Nov 2013 07:18:56 -0700

On 11/23/2013 08:11 AM, Graham Leggett wrote:
Hi all,

I have two LDAP servers in a multimaster replication setup that has worked fine for a while.

Recently it was reported to me that the two LDAP servers had somehow gone out of sync and refused to replicate. I am trying to fix this by triggering an initialisation from the server I've chosen as the authoritative source of data to the other, using the instructions here: http://www.centos.org/docs/5/html/CDS/ag/8.0/Managing_Replication-Configuring-Replication-cmd.html#Configuring-Replication-InitializingConsumers-cmd

Although this (probably) doesn't have anything to do with your issue:
1) I strongly encourage you to upgrade to the latest EPEL5 version of 
389-ds-base which is based on 389-ds-base-1.2.11 - there is a version in 
EPEL5 testing now.
2) Then I strongly encourage you to use the latest version of the docs:
https://access.redhat.com/site/documentation/en-US/Red_Hat_Directory_Server/9.0/html/Administration_Guide/Populating_Directory_Databases.html#Importing_Data-Initializing_a_Database_from_the_Console

When the replication is triggered, a few thousand lines appear on the remote side's log that look like this:

[23/Nov/2013:15:00:07 +0000] conn=4402 op=0 BIND dn="cn=Replication Manager,cn=config" method=128 version=3
[23/Nov/2013:15:00:07 +0000] conn=4402 op=0 RESULT err=0 tag=97 nentries=0 etime=1 dn="cn=replication manager,cn=config"
[23/Nov/2013:15:00:07 +0000] conn=4402 op=1 SRCH base="" scope=0 filter="(objectClass=*)" attrs="supportedControl supportedExtension"
[23/Nov/2013:15:00:07 +0000] conn=4402 op=1 RESULT err=0 tag=101 nentries=1 etime=0
[23/Nov/2013:15:00:07 +0000] conn=4402 op=2 SRCH base="" scope=0 filter="(objectClass=*)" attrs="supportedControl supportedExtension"
[23/Nov/2013:15:00:07 +0000] conn=4402 op=2 RESULT err=0 tag=101 nentries=1 etime=0
[23/Nov/2013:15:00:07 +0000] conn=4402 op=3 EXT oid="2.16.840.1.113730.3.5.12" name="replication-multimaster-extop"
[23/Nov/2013:15:00:07 +0000] conn=4402 op=3 RESULT err=0 tag=120 nentries=0 etime=0
[23/Nov/2013:15:00:08 +0000] conn=4402 op=4 EXT oid="2.16.840.1.113730.3.5.6" name="Netscape Replication Total Update Entry"
[23/Nov/2013:15:00:08 +0000] conn=4402 op=4 RESULT err=0 tag=120 nentries=0 etime=0
[23/Nov/2013:15:00:08 +0000] conn=4402 op=5 EXT oid="2.16.840.1.113730.3.5.6" name="Netscape Replication Total Update Entry"
[23/Nov/2013:15:00:08 +0000] conn=4402 op=5 RESULT err=0 tag=120 nentries=0 etime=0
[snip a few thousand log entries all saying err=0]

Right.  They will all say err=0 until the last one has err=N where N > 
0, if there is some sort of error condition.
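
With a few thousand entries it is easier to let a script find the non-zero result than to read the log by eye. A minimal sketch, assuming the access log keeps the "conn=... op=... RESULT err=..." layout shown above; the function name and the failing sample line (op=6, err=1) are made up for illustration and are not from the actual log:

```python
import re

# Matches RESULT lines in the access log layout quoted in this thread.
RESULT_RE = re.compile(r"conn=(\d+) op=(\d+) RESULT err=(-?\d+)")

def find_failed_results(lines):
    """Return (conn, op, err) for every RESULT line whose err is not 0."""
    failures = []
    for line in lines:
        m = RESULT_RE.search(line)
        if m and int(m.group(3)) != 0:
            failures.append((int(m.group(1)), int(m.group(2)), int(m.group(3))))
    return failures

sample_access = [
    '[23/Nov/2013:15:00:08 +0000] conn=4402 op=4 EXT oid="2.16.840.1.113730.3.5.6" name="Netscape Replication Total Update Entry"',
    '[23/Nov/2013:15:00:08 +0000] conn=4402 op=4 RESULT err=0 tag=120 nentries=0 etime=0',
    # Hypothetical failing entry, invented for this example only.
    '[23/Nov/2013:15:00:09 +0000] conn=4402 op=6 RESULT err=1 tag=120 nentries=0 etime=0',
]

failed = find_failed_results(sample_access)
for conn, op, err in failed:
    print(f"conn={conn} op={op} failed with err={err}")
```

To run it against a real file, pass the open file object instead of the sample list; the path to the access log depends on the instance and is not assumed here.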

The side that I initialised the replication from lists this message as the status, which is too vague to be useful:

nsds5replicaLastInitStatus: -2 Total update abortedSystem error

Does anyone know what the error "-2" means?

Negative numbers usually mean some sort of connection error.
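
If you want to check the status from a script, the numeric code can be split from the message text. A minimal sketch, assuming the attribute value always starts with an integer followed by free text, as in the line above; the function name is made up, and the "negative means connection error" test is only the rule of thumb just mentioned, not a documented mapping:

```python
import re

# Attribute name, then an integer code, then the free-text message.
STATUS_RE = re.compile(r"^nsds5replicaLastInitStatus:\s*(-?\d+)\s*(.*)$")

def parse_init_status(line):
    """Return (code, message) from a status line, or None if it does not match."""
    m = STATUS_RE.match(line.strip())
    if m is None:
        return None
    return int(m.group(1)), m.group(2)

code, message = parse_init_status(
    "nsds5replicaLastInitStatus: -2 Total update abortedSystem error"
)

# Rule of thumb only: negative codes usually point at a connection problem.
likely_connection_error = code < 0
print(code, repr(message), likely_connection_error)
```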

Does anyone have any clear and unambiguous instructions for re-initialising two LDAP servers that have gone out of sync?

You are following the correct steps.  The problem is that there is some 
sort of exceptional condition preventing replication from successfully 
completing initialization.

Are there any errors in the errors logs from either the supplier or the 
consumer from around this time?
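
To pull out only the lines near the failure, something like the following may help. A minimal sketch, assuming the errors log starts each line with the same bracketed timestamp as the access log quoted above (that is an assumption, no errors log lines appear in this thread); the function name and window size are made up, and the sample lines are access log lines taken from this thread:

```python
import re
from datetime import datetime, timedelta

TS_RE = re.compile(r"^\[([^\]]+)\]")
TS_FORMAT = "%d/%b/%Y:%H:%M:%S %z"  # e.g. 23/Nov/2013:15:00:07 +0000

def lines_near(lines, when, window_seconds=60):
    """Return the lines whose timestamp is within window_seconds of `when`."""
    center = datetime.strptime(when, TS_FORMAT)
    window = timedelta(seconds=window_seconds)
    selected = []
    for line in lines:
        m = TS_RE.match(line)
        if not m:
            continue  # no leading timestamp, e.g. a continuation line
        try:
            ts = datetime.strptime(m.group(1), TS_FORMAT)
        except ValueError:
            continue  # bracketed text that is not a timestamp
        if abs(ts - center) <= window:
            selected.append(line)
    return selected

sample_log = [
    '[23/Nov/2013:15:00:07 +0000] conn=4402 op=3 RESULT err=0 tag=120 nentries=0 etime=0',
    '[23/Nov/2013:15:00:08 +0000] conn=4402 op=4 RESULT err=0 tag=120 nentries=0 etime=0',
]

exact = lines_near(sample_log, "23/Nov/2013:15:00:08 +0000", window_seconds=0)
nearby = lines_near(sample_log, "23/Nov/2013:15:00:08 +0000", window_seconds=60)
print(len(exact), len(nearby))
```

Running the same filter over the errors log on both the supplier and the consumer, with the time of the failed initialization as `when`, should narrow things down to the relevant entries.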

Regards,
Graham
--

--
389 users mailing list
389-users@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/389-users
