[389-users] Corrupted database again, with 389 MMR replication and TCP errors

We are still having this issue every so often: a 389 database becomes
unresponsive to queries, and the errors log shows the messages below
(full log at the end).

We started a thread about this back in November, but didn't really get
a resolution:
https://www.redhat.com/archives/fedora-directory-users/2009-November/msg00056.html

The only way to fix it seems to be to re-import a replication export
from a provider.
I am new, so I'm not sure whether this is the right place to report
this, or whether I should submit a bug instead.

Some background: we have 3 masters that replicate to each other and
to 2 replication hubs. Those hubs replicate to each other and to 12
slaves in 2 data centers. We have host-based firewalls, network-based
firewalls, and routers, and our replication traffic does pass through
firewalls and routers, even between masters. We still see a lot of
these errors in the 389 errors log:
  [11/Dec/2009:00:06:19 -0500] - PR_Write(133) Netscape Portable Runtime 
error -5961 (TCP connection reset by peer.)

We have worked to reduce these errors, but have not been able to
eliminate them entirely. Based on that, I am concerned that some
replication sessions are getting interrupted, and that this might be
contributing to the problem.

The situation doesn't happen daily, and doesn't necessarily correlate
with all of the TCP errors. But it could happen again during the next
week or so. What other information and logging should we be gathering to
figure out what is happening here? Should we create a bug on this?
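
One way to check whether the failures line up with the TCP resets would
be to tally the errors log by hour and message type. Below is a minimal
Python sketch of that idea, not something taken from 389 itself. The
labels in PATTERNS and the function names are made up for illustration;
the substrings they match are copied from the log excerpt in this
message. It also rejoins entries that have been wrapped across lines,
as they are in this email.

```python
import re
from collections import Counter

# Timestamp prefix as it appears in the errors log, e.g.
# [11/Dec/2009:04:35:00 -0500]; group 1 is the date, group 2 the hour.
TIMESTAMP = re.compile(r"^\[(\d{2}/\w{3}/\d{4}):(\d{2}):\d{2}:\d{2} [+-]\d{4}\]")

# Hypothetical labels mapped to substrings seen in the log excerpt.
PATTERNS = {
    "tcp_reset": "TCP connection reset by peer",
    "ruv_update_failed": "replica_replace_ruv_tombstone",
    "bulk_import_failed": "slapi_start_bulk_import: failed",
    "import_abandoned": "bulk import abandoned",
}


def join_wrapped(lines):
    """Yield one string per log entry, rejoining wrapped continuation lines.

    A new entry starts at a line with a timestamp prefix; a blank line
    ends the current entry; other lines are appended to it.
    """
    parts = []
    for raw in lines:
        line = raw.strip()
        if TIMESTAMP.match(line):
            if parts:
                yield " ".join(parts)
            parts = [line]
        elif not line:
            if parts:
                yield " ".join(parts)
            parts = []
        elif parts:
            parts.append(line)
    if parts:
        yield " ".join(parts)


def tally(lines):
    """Count matching entries per (date:hour, label)."""
    counts = Counter()
    for entry in join_wrapped(lines):
        match = TIMESTAMP.match(entry)
        hour = f"{match.group(1)}:{match.group(2)}"
        for label, needle in PATTERNS.items():
            if needle in entry:
                counts[(hour, label)] += 1
    return counts
```

To use it, pass an open file to tally(), for example
tally(open(path)) where path points at the instance's errors log (the
exact location depends on the installation), and print the resulting
counts sorted by key to see which hours have resets, RUV update
failures, or both.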

This is the log from this morning at 4am on zuber, a slave, where the 
'accounts' db was affected:

zuber.iam.gatech.edu: cat errors
        389-Directory/1.2.2 B2009.237.2054
        zuber.iam.gatech.edu:636 (/etc/dirsrv/slapd-zuber)

[11/Dec/2009:00:06:19 -0500] - PR_Write(133) Netscape Portable Runtime 
error -5961 (TCP connection reset by peer.)
[11/Dec/2009:00:06:19 -0500] - PR_Write(75) Netscape Portable Runtime 
error -5961 (TCP connection reset by peer.)
[11/Dec/2009:00:06:20 -0500] - PR_Write(161) Netscape Portable Runtime 
error -5961 (TCP connection reset by peer.)
[11/Dec/2009:00:10:43 -0500] - PR_Write(217) Netscape Portable Runtime 
error -5961 (TCP connection reset by peer.)
[11/Dec/2009:00:11:48 -0500] - PR_Write(246) Netscape Portable Runtime 
error -5961 (TCP connection reset by peer.)
[11/Dec/2009:00:14:20 -0500] - PR_Write(256) Netscape Portable Runtime 
error -5961 (TCP connection reset by peer.)
[11/Dec/2009:00:14:24 -0500] - PR_Write(218) Netscape Portable Runtime 
error -5961 (TCP connection reset by peer.)
[11/Dec/2009:00:14:35 -0500] - PR_Write(243) Netscape Portable Runtime 
error -5961 (TCP connection reset by peer.)
[11/Dec/2009:00:14:45 -0500] - PR_Write(264) Netscape Portable Runtime 
error -5961 (TCP connection reset by peer.)
[11/Dec/2009:00:14:54 -0500] - PR_Write(274) Netscape Portable Runtime 
error -5961 (TCP connection reset by peer.)
[11/Dec/2009:00:14:59 -0500] - PR_Write(284) Netscape Portable Runtime 
error -5961 (TCP connection reset by peer.)
[11/Dec/2009:00:15:00 -0500] - PR_Write(287) Netscape Portable Runtime 
error -5961 (TCP connection reset by peer.)
[11/Dec/2009:00:15:00 -0500] - PR_Write(220) Netscape Portable Runtime 
error -5961 (TCP connection reset by peer.)
[11/Dec/2009:04:34:00 -0500] - PR_Write(83) Netscape Portable Runtime 
error -5961 (TCP connection reset by peer.)
[11/Dec/2009:04:34:04 -0500] - PR_Write(224) Netscape Portable Runtime 
error -5961 (TCP connection reset by peer.)
[11/Dec/2009:04:34:07 -0500] - PR_Write(94) Netscape Portable Runtime 
error -5961 (TCP connection reset by peer.)
[11/Dec/2009:04:34:07 -0500] - PR_Write(144) Netscape Portable Runtime 
error -5961 (TCP connection reset by peer.)
[11/Dec/2009:04:34:08 -0500] - PR_Write(222) Netscape Portable Runtime 
error -5961 (TCP connection reset by peer.)
[11/Dec/2009:04:34:08 -0500] - PR_Write(227) Netscape Portable Runtime 
error -5961 (TCP connection reset by peer.)
[11/Dec/2009:04:34:08 -0500] - PR_Write(233) Netscape Portable Runtime 
error -5961 (TCP connection reset by peer.)
[11/Dec/2009:04:34:59 -0500] - ldbm: 'accounts' is already in the middle 
of another task and cannot be disturbed.
[11/Dec/2009:04:34:59 -0500] - slapi_start_bulk_import: failed; error = -23
[11/Dec/2009:04:35:00 -0500] NSMMReplicationPlugin - conn=2847000 op=178 
replica="ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu": 
Unable to acquire replica: error: internal error
[11/Dec/2009:04:35:00 -0500] NSMMReplicationPlugin - 
multimaster_be_state_change: replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu is 
going offline; disabling replication
[11/Dec/2009:04:35:02 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:35:05 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:36:02 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:36:13 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:36:25 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:36:39 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:36:46 -0500] - PR_Write(220) Netscape Portable Runtime 
error -5961 (TCP connection reset by peer.)
[11/Dec/2009:04:37:09 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:37:12 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:37:15 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:37:35 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:38:03 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:38:09 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:38:31 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:38:37 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:39:33 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:39:37 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:39:44 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:39:46 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:39:49 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:39:52 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:39:56 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:39:59 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:40:07 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:40:12 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:40:15 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:40:20 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:40:25 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:40:31 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:40:37 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:40:46 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:40:51 -0500] - WARNING: Import is running with 
nsslapd-db-private-import-mem on; No other process is allowed to access 
the database
[11/Dec/2009:04:40:51 -0500] - ERROR bulk import abandoned
[11/Dec/2009:04:40:51 -0500] - import accounts: Aborting all import 
threads...
[11/Dec/2009:04:41:00 -0500] - import accounts: Import threads aborted.
[11/Dec/2009:04:41:00 -0500] - import accounts: Closing files...
[11/Dec/2009:04:41:00 -0500] - import accounts: Import failed.
[11/Dec/2009:04:41:00 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1
[11/Dec/2009:04:41:29 -0500] NSMMReplicationPlugin - 
replica_replace_ruv_tombstone: failed to update replication update 
vector for replica 
ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu: LDAP 
error - 1

  (this error then repeats for hours)



