Juan Asensio Sánchez wrote:
> Hi
>
> I am having problems with some replicas. Using 389 DS 1.2.5, CentOS
> 5.5. A few days ago, a server crashed, and when it restarted it had
> the time of the crash (more than 1 day old). Just after the server
> started up, the time was synced with NTP, but when dirsrv started,
> the time was wrong. Since then, the replication agreements of the
> multimaster database it hosts have been giving problems: "-1
> Incremental update has failed and requires administrator actionSystem
> error". So, I am trying to initialize the rest of the servers from
> the "main" one (the server where most of the modifications are done;
> we have 6 servers in multimaster mode for the database, and other
> databases in hub mode). When I try to initialize the server, I get
> this error on the supplier: "Replication error acquiring replica:
> excessive clock skew. Error Code: 2", although all the servers have
> the same time. In the consumer log, I get this:
>
> [16/Aug/2010:10:04:58 +0200] - csngen_adjust_time: adjustment limit exceeded; value - 1390893, limit - 86400
> [16/Aug/2010:10:04:58 +0200] - CSN generator's state:
> [16/Aug/2010:10:04:58 +0200] - replica id: 5
> [16/Aug/2010:10:04:58 +0200] - sampled time: 1281945898
> [16/Aug/2010:10:04:58 +0200] - local offset: 0
> [16/Aug/2010:10:04:58 +0200] - remote offset: 0
> [16/Aug/2010:10:04:58 +0200] - sequence number: 111
>
> I am stuck now. I tried to export the database from the supplier,
> import it in the consumer, and reinitialize, without success. I also
> tried to disable the replica on both supplier and consumer, re-enable
> it, and recreate the replication agreements, without success. I have
> seen this bug https://bugzilla.redhat.com/show_bug.cgi?id=233642, but
> we have version 1.2.5, so this bug is supposed to be fixed. This is
> the result of readNsState.py on the supplier (only for the database
> giving problems):
>
> nsState is BAAAADT2aEwAAAAAAQAAAAQAAAA=
> Little Endian
> For replica cn=replica, cn="dc=XXXXX,dc=XXXX", cn=mapping tree, cn=config
> fmtstr=[H2x3IH2x]
> size=20
> len of nsstate is 20
> CSN generator state:
> Replica ID : 4
> Sampled Time : 1281947188
> Gen as csn : 4c68f634000400040000
> Time as str : Mon Aug 16 10:26:28 2010
> Local Offset : 0
> Remote Offset : 1
> Seq. num : 4
> System time : Mon Aug 16 10:26:42 2010
> Diff in sec. : 14
> Day:sec diff : 0:14
>
> And this on the consumer:
>
> nsState is BQAAAPv1aEwAAAAAAAAAAAIAAAA=
> Little Endian
> For replica cn=replica, cn="dc=XXX,dc=XXXXX", cn=mapping tree, cn=config
> fmtstr=[H2x3IH2x]
> size=20
> len of nsstate is 20
> CSN generator state:
> Replica ID : 5
> Sampled Time : 1281947131
> Gen as csn : 4c68f5fb000200050000
> Time as str : Mon Aug 16 10:25:31 2010
> Local Offset : 0
> Remote Offset : 0
> Seq. num : 2
> System time : Mon Aug 16 10:26:24 2010
> Diff in sec. : 53
> Day:sec diff : 0:53
>
> I think the low remote offset (according to the bug, this number
> should increase with the changes) is due to the initialization of the
> database from the exports. Any help? All replication agreements are a
> disaster now :S.

The bug that caused this to happen was fixed, but unfortunately the fix
cannot repair bad nsState that already exists. The problem is that the
CSN generator state attribute (nsState) in the cn=replica entry for the
suffix is not cleaned up when you re-init replication. In general it
can't be, because cleaning it up could let the server generate CSNs
that it has already generated before.
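For reference, the nsState blob is just a small packed structure, and
the numbers readNsState.py prints can be reproduced directly from the
base64 value. A quick sketch of roughly what the tool is doing,
assuming the 20-byte little-endian layout (fmtstr [H2x3IH2x]) shown in
your output:

import base64, struct, time

def decode_nsstate(b64value):
    # 20 bytes: replica id (H), 2 pad, then sampled time / local offset /
    # remote offset (3I), then sequence number (H), 2 pad
    raw = base64.b64decode(b64value)
    rid, sampled, local_off, remote_off, seq = struct.unpack('<H2x3IH2x', raw)
    return rid, time.ctime(sampled), local_off, remote_off, seq

# Your supplier value:
print(decode_nsstate('BAAAADT2aEwAAAAAAQAAAAQAAAA='))
# replica id 4, sampled time Mon Aug 16 10:26:28 2010 (in the server's
# local timezone), local offset 0, remote offset 1, seq. num 4

The consumer value decodes the same way.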
I think the solution here is to first unconfigure replication, then
shut down the servers, dump the database(s) to LDIF, and remove the
nsState attribute (a rough sketch of that step is at the end of this
message). You will have to do this on every server. Then start up,
reconfigure replication, reload the data, and re-initialize all of the
other replicas. Make sure all of your servers are in time sync before
you begin. I know this is a pain, but I don't know of any other way to
get rid of the bad nsState.

> Regards and thanks in advance.
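P.S. For the "remove the nsState attribute" step, a tiny filter over
the LDIF dump is enough. This is just an untested sketch (the script
name and the assumption that nsState appears as a single, unfolded
"nsState:: <base64>" line are mine); deleting the line by hand in an
editor works just as well:

# strip_nsstate.py - copy an LDIF file, dropping nsState attribute lines
import sys

def strip_nsstate(src, dst):
    fin = open(src)
    fout = open(dst, 'w')
    for line in fin:
        # drops both "nsState:" and the base64 "nsState::" form,
        # whatever the attribute name's case
        if line.lower().startswith('nsstate:'):
            continue
        fout.write(line)
    fout.close()
    fin.close()

if __name__ == '__main__':
    # e.g. python strip_nsstate.py userRoot.ldif userRoot-clean.ldif
    strip_nsstate(sys.argv[1], sys.argv[2])

The important part is only that no old nsState value survives into the
reload on any of the servers.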