Re: [389-users] Multimaster replication out of sync

Mitja Mihelič wrote:


On 12/12/2009 12:06 AM, Rich Megginson wrote:
Mitja Mihelič wrote:


On 12/07/2009 05:18 PM, Rich Megginson wrote:
Mitja Mihelic wrote:
Hi!

We have two instances of the DS in a multimaster replication setup.
We had to restore the database of one of the servers from backup.
While the second master was down, the first was receiving updates.
After we started the restored master, it began receiving updates as
soon as a change occurred on the first master (i.e. after 15 minutes).
After the sync finished, we noticed they weren't identical.
Clicking "Send updates now" from the replication agreement does not help.

Is there a way to get them synced up again, other than reinitializing
the second (restored) master?
How long was the server down? How old was the backup it was restored from?
The server was not down long, but the backup was about 10 hours old.
This was a backup at filesystem level made by ufsdump. It was not a "regular" DS backup.
When we restored the database file from the dump the server booted OK.

Then we ran a small test:
- made another ufsdump of the second master
- shut down the server
- let the primary master update for an hour
- restored the second master's database from the dump
- started the second master
- let them do their replication magic
- isolated both servers (i.e. no updates)
- compared the LDIF dumps
Again, they were not the same.
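
For anyone repeating the comparison step, here is a minimal sketch of how two LDIF exports could be compared by DN. It is not the procedure used above; the function names and the sample entries are made up for illustration. It assumes simple LDIF exports with blank-line-separated entries, compares base64-encoded values in their encoded form, and does not filter operational attributes (which may legitimately differ between masters).

```python
def parse_ldif(text):
    """Return {lowercased dn: set of (attribute, value) pairs} for a simple LDIF export."""
    entries = {}
    # Unfold continuation lines: a line starting with a single space continues the previous one.
    unfolded = text.replace("\r\n", "\n").replace("\n ", "")
    for block in unfolded.split("\n\n"):
        lines = [ln for ln in block.split("\n") if ln and not ln.startswith("#")]
        if not lines or not lines[0].lower().startswith("dn:"):
            continue  # skips e.g. a leading "version: 1" block
        dn = lines[0].split(":", 1)[1].strip().lower()
        attrs = set()
        for ln in lines[1:]:
            name, _, value = ln.partition(":")
            attrs.add((name.lower(), value.strip()))
        entries[dn] = attrs
    return entries


def compare(first_ldif, second_ldif):
    """Report DNs present on only one side and DNs whose attribute sets differ."""
    a, b = parse_ldif(first_ldif), parse_ldif(second_ldif)
    return {
        "only_on_first": sorted(a.keys() - b.keys()),
        "only_on_second": sorted(b.keys() - a.keys()),
        "differing": sorted(dn for dn in a.keys() & b.keys() if a[dn] != b[dn]),
    }


# Hypothetical sample data, not taken from the servers discussed in this thread.
FIRST = """dn: uid=alice,dc=example,dc=com
uid: alice
mail: alice@example.com

dn: uid=bob,dc=example,dc=com
uid: bob
"""

SECOND = """dn: uid=alice,dc=example,dc=com
uid: alice
mail: alice@old.example.com
"""

if __name__ == "__main__":
    for kind, dns in compare(FIRST, SECOND).items():
        print(kind, dns)
```

Keying on the DN rather than diffing the files line by line avoids false differences caused by entry ordering in the export.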

We probably should have used the built-in backup functionality, right?
Yes, although I'm not sure what would be causing the problems you see.

In general, when the database state changes, you have to reinitialize replication.
We tried the built-in backup:
/usr/lib/dirsrv/serverReplica/db2bak /var/lib/dirsrv/serverReplica/bak/`date +%Y_%m_%d_%H_%M_%S`

Executed the same test procedure as described above.

There are still entries on the primary server that do not get replayed on the secondary.

An error message (repeated every 5 minutes) appears on the primary master SERVER1 when a record that is missing on the secondary gets updated on the primary:

[16/Dec/2009:10:26:02 +0100] NSMMReplicationPlugin - agmt="cn=MM to SERVER2" (SERVER2:389): Consumer failed to replay change (uniqueid 25ab6e01-1dd211b2-bdbbda0a-92130000, CSN 4b28a7ac0000000b0000): No such object. Skipping.
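
To see which entry and which originating replica such a message refers to, the line can be picked apart with a short script. This is only a sketch: the regular expression is fitted to the single message quoted above, and the CSN layout (8 hex digits of timestamp, then 4 each for sequence number, replica ID and sub-sequence number) is an assumption about how the server formats CSNs, not something stated in this thread.

```python
import re
from datetime import datetime, timezone

# Pattern fitted to the "Consumer failed to replay change" message quoted in this thread.
LOG_RE = re.compile(
    r'agmt="(?P<agmt>[^"]+)" \((?P<host>[^:]+):(?P<port>\d+)\): '
    r'Consumer failed to replay change \(uniqueid (?P<uniqueid>[0-9a-f-]+), '
    r'CSN (?P<csn>[0-9a-f]{20})\): (?P<reason>[^.]+)\.'
)


def decode_csn(csn):
    """Split a 20-hex-digit CSN into its fields (layout is an assumption, see above)."""
    return {
        "time": datetime.fromtimestamp(int(csn[0:8], 16), tz=timezone.utc),
        "seq": int(csn[8:12], 16),
        "replica_id": int(csn[12:16], 16),
        "subseq": int(csn[16:20], 16),
    }


def parse_replay_failure(line):
    """Return the named fields of a replay-failure log line, or None if it does not match."""
    match = LOG_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["csn_fields"] = decode_csn(fields["csn"])
    return fields


LINE = (
    '[16/Dec/2009:10:26:02 +0100] NSMMReplicationPlugin - agmt="cn=MM to SERVER2" '
    "(SERVER2:389): Consumer failed to replay change (uniqueid "
    "25ab6e01-1dd211b2-bdbbda0a-92130000, CSN 4b28a7ac0000000b0000): "
    "No such object. Skipping."
)

if __name__ == "__main__":
    info = parse_replay_failure(LINE)
    print(info["uniqueid"], info["reason"], info["csn_fields"])
```

Under that layout, collecting the uniqueid values from all such lines gives a list of entries the consumer is missing, which is a quick way to judge how far apart the two masters are.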

My reasoning would be: if the entry does not exist on the consumer, create it. But I guess that is not how the mechanism works.
I'm still scratching my head about this one...
In general, if you restore or otherwise change a database, that server will have to be reinitialized in order for replication to work.
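
As a rough illustration of what a reinitialization request can look like outside the console, the sketch below builds an LDIF modify operation that sets nsds5BeginReplicaRefresh to "start" on a replication agreement entry. The agreement name is taken from the log message in this thread; the suffix dc=example,dc=com and the quoted form of the mapping tree DN are assumptions and need to be checked against the actual cn=config tree. The helper name is made up for this example.

```python
def reinit_ldif(agreement_cn, suffix):
    """Build an LDIF modify that asks the supplier to reinitialize the consumer
    behind the given replication agreement."""
    # Assumed DN layout; verify against the real agreement entry under cn=config.
    dn = f'cn={agreement_cn},cn=replica,cn="{suffix}",cn=mapping tree,cn=config'
    return "\n".join(
        [
            f"dn: {dn}",
            "changetype: modify",
            "replace: nsds5BeginReplicaRefresh",
            "nsds5BeginReplicaRefresh: start",
            "",
        ]
    )


if __name__ == "__main__":
    # "MM to SERVER2" comes from the log line above; the suffix is a placeholder.
    print(reinit_ldif("MM to SERVER2", "dc=example,dc=com"))
```

The output would be applied on the supplier (here SERVER1) with ldapmodify, bound as a user allowed to modify cn=config. Note that this replaces the consumer's data for that suffix with the supplier's copy.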

Regards,
Mitja

--
389 users mailing list
389-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/fedora-directory-users
