mark benschop wrote: > Hi Rich, > Thanks for your reply. > Please find the logging from the problems below. > The serverb55 is one of 2 servers in a multiple masters configuration > that consists of serverb55 and serverb05. > > The problem I inititially had was that I had 2 entries that could not > be deleted serverb55. > > Here's logging from the access file. > ======================================================================= > access.20100614-092820:[15/Jun/2010:09:20:49 +0200] conn=342177 op=7 > SRCH base="uid=dbeijk, ou=people, dc=directory,dc=intern" scope=0 > filter="(objectClass=*)" attrs=ALL > access.20100614-092820:[15/Jun/2010:09:20:49 +0200] conn=342177 op=7 > RESULT err=0 tag=101 nentries=1 etime=0 > access.20100614-092820:[15/Jun/2010:09:22:08 +0200] conn=342177 op=8 > SRCH base="uid=dbeijk, ou=people, dc=directory,dc=intern" scope=1 > filter="(objectClass=*)" attrs="objectClass" > access.20100614-092820:[15/Jun/2010:09:22:08 +0200] conn=342177 op=8 > RESULT err=0 tag=101 nentries=0 etime=0 notes=U > access.20100614-092820:[15/Jun/2010:09:22:08 +0200] conn=342177 op=9 > DEL dn="uid=dbeijk, ou=people, dc=directory,dc=intern" > access.20100614-092820:[15/Jun/2010:09:22:08 +0200] conn=342177 op=9 > RESULT err=1 tag=107 nentries=0 etime=0 csn=4c172a21000000370000 > access.20100614-092820:[15/Jun/2010:09:22:08 +0200] conn=342177 op=10 > SRCH base="uid=dbeijk, ou=people, dc=directory,dc=intern" scope=1 > filter="(objectClass=*)" attrs="objectClass" > access.20100614-092820:[15/Jun/2010:09:22:08 +0200] conn=342177 op=10 > RESULT err=0 tag=101 nentries=0 etime=0 notes=U > ======================================================================= > > LDAP error 1 i found means 'unwiling to perform'. First I thought > something might be wrong with the entry itself. > The error log found in the error log from the serverb55 I've added > below seemed to point in that direction. > > > When I logged on the the other ldapserver, serverb05, I tried to > delete the same entry to see if this slapd had the same issue but here > it worked. > Replicating the delete didn't. The following error was logged to the > errorlog of this : > > ======================================================================== > [15/Jun/2010:09:35:17 +0200] NSMMReplicationPlugin - > agmt="cn=serverb55" (serverb55:636): Consumer failed to replay change > (uniqueid a276337c-5dc511df-852cfef8-667fa4d4, CSN > 4c172d36000000050000): Operations error. Will retry later. > ======================================================================= > > So there seemed to be a problem with the serverb55 only. > Since I assumed the database got somehow corrupt or inconsistent I've > tried the following steps to try and recreate the database or had it > checked in order to get it right again. > First there's the errors from the account that could not be deleted. > I 'reinitialised the consumer' from the working serverb05 to the > problematic serverb55. > Then I restarted the slapd. > Made an export of the database and imported that. > Slapd stopped the database. > > Please find the logging from /var/log/dirsrv/slapd-serverb55/errors > from the actions leading to the problem of the fatal server stop. > ====================================================================== > CentOS-Directory/8.1.0 > B2009.134.1334 > > serverb55:636 > (/etc/dirsrv/slapd-serverb55) > > > > [15/Jun/2010:09:22:58 +0200] - Entry "uid=dbeijk, ou=People, > dc=directory,dc=intern" missing attribute "uidNumber" required by > object class "posixAccount" > [15/Jun/2010:09:22:58 +0200] - Entry "uid=dbeijk, ou=People, > dc=directory,dc=intern" missing attribute "gidNumber" required by > object class "posixAccount" > [15/Jun/2010:09:22:58 +0200] - Entry "uid=dbeijk, ou=People, > dc=directory,dc=intern" -- attribute "uidNumber" not allowed > [15/Jun/2010:09:22:58 +0200] - Entry "uid=dbeijk, ou=People, > dc=directory,dc=intern" missing attribute "uid" required by object > class "posixAccount" > [15/Jun/2010:09:22:58 +0200] - Entry "uid=dbeijk, ou=People, > dc=directory,dc=intern" missing attribute "cn" required by object > class "posixAccount" > [15/Jun/2010:09:22:58 +0200] - Entry "uid=dbeijk, ou=People, > dc=directory,dc=intern" missing attribute "homeDirectory" required by > object class "posixAccount" > [15/Jun/2010:09:23:04 +0200] - Entry "uid=dbeijk, ou=People, > dc=directory,dc=intern" missing attribute "uidNumber" required by > object class "posixAccount" > [15/Jun/2010:09:23:18 +0200] - Entry "uid=dbeijk, ou=People, > dc=directory,dc=intern" missing attribute "uidNumber" required by > object class "posixAccount" > [15/Jun/2010:09:23:18 +0200] - Entry "uid=dbeijk, ou=People, > dc=directory,dc=intern" missing attribute "gidNumber" required by > object class "posixAccount" > [15/Jun/2010:09:23:18 +0200] - Entry "uid=dbeijk, ou=People, > dc=directory,dc=intern" required attribute "objectclass" missing > [15/Jun/2010:09:23:18 +0200] - Entry "uid=dbeijk, ou=People, > dc=directory,dc=intern" required attribute "objectclass" missing > [15/Jun/2010:09:23:18 +0200] - Entry "uid=dbeijk, ou=People, > dc=directory,dc=intern" missing attribute "uid" required by object > class "posixAccount" > [15/Jun/2010:09:23:18 +0200] - Entry "uid=dbeijk, ou=People, > dc=directory,dc=intern" missing attribute "cn" required by object > class "posixAccount" > [15/Jun/2010:09:23:18 +0200] - Entry "uid=dbeijk, ou=People, > dc=directory,dc=intern" missing attribute "homeDirectory" required by > object class "posixAccount" > [15/Jun/2010:09:24:56 +0200] - Entry "uid=DEL *.*, ou=People, > dc=directory,dc=intern" missing attribute "homeDirectory" required by > object class "posixAccount" > [15/Jun/2010:09:50:20 +0200] - Entry "cn=wchiman, ou=people, > dc=directory,dc=intern" -- attribute "uidNumber" not allowed > [15/Jun/2010:10:12:43 +0200] NSMMReplicationPlugin - > multimaster_be_state_change: replica dc=directory,dc=intern is going > offline; disabling replication > [15/Jun/2010:10:12:43 +0200] - attrcrypt_unwrap_key: failed to unwrap > key for cipher AES > [15/Jun/2010:10:12:43 +0200] - Failed to retrieve key for cipher AES > in attrcrypt_cipher_init > [15/Jun/2010:10:12:43 +0200] - Failed to initialize cipher AES in > attrcrypt_init > [15/Jun/2010:10:12:43 +0200] - WARNING: Import is running with > nsslapd-db-private-import-mem on; No other process is allowed to > access the database > [15/Jun/2010:10:12:48 +0200] - import userRoot: Workers finished; > cleaning up... > [15/Jun/2010:10:12:49 +0200] - import userRoot: Workers cleaned up. > [15/Jun/2010:10:12:49 +0200] - import userRoot: Indexing complete. > Post-processing... > [15/Jun/2010:10:12:49 +0200] - import userRoot: Flushing caches... > [15/Jun/2010:10:12:49 +0200] - import userRoot: Closing files... > [15/Jun/2010:10:12:49 +0200] - import userRoot: Import complete. > Processed 4849 entries in 6 seconds. (808.17 entries/sec) > [15/Jun/2010:10:12:49 +0200] - attrcrypt_unwrap_key: failed to unwrap > key for cipher AES > [15/Jun/2010:10:12:49 +0200] - Failed to retrieve key for cipher AES > in attrcrypt_cipher_init > [15/Jun/2010:10:12:49 +0200] - Failed to initialize cipher AES in > attrcrypt_init > [15/Jun/2010:10:12:49 +0200] NSMMReplicationPlugin - > multimaster_be_state_change: replica dc=directory,dc=intern is coming > online; enabling replication > [15/Jun/2010:10:12:49 +0200] NSMMReplicationPlugin - > replica_reload_ruv: Warning: new data for replica > dc=directory,dc=intern does not match the data in the changelog. > Recreating the changelog file. This could affect replication with > replica's consumers in which case the consumers should be reinitialized. > [15/Jun/2010:10:12:49 +0200] - skipping cos definition > cn=nsAccountInactivation_cos,dc=directory,dc=intern--no templates found > [15/Jun/2010:10:55:43 +0200] NSMMReplicationPlugin - > multimaster_be_state_change: replica dc=directory,dc=intern is going > offline; disabling replication > [15/Jun/2010:10:55:43 +0200] - attrcrypt_unwrap_key: failed to unwrap > key for cipher AES > [15/Jun/2010:10:55:43 +0200] - Failed to retrieve key for cipher AES > in attrcrypt_cipher_init > [15/Jun/2010:10:55:43 +0200] - Failed to initialize cipher AES in > attrcrypt_init > [15/Jun/2010:10:55:43 +0200] - WARNING: Import is running with > nsslapd-db-private-import-mem on; No other process is allowed to > access the database > [15/Jun/2010:10:55:49 +0200] - import userRoot: Workers finished; > cleaning up... > [15/Jun/2010:10:55:49 +0200] - import userRoot: Workers cleaned up. > [15/Jun/2010:10:55:49 +0200] - import userRoot: Indexing complete. > Post-processing... > [15/Jun/2010:10:55:49 +0200] - import userRoot: Flushing caches... > [15/Jun/2010:10:55:49 +0200] - import userRoot: Closing files... > [15/Jun/2010:10:55:49 +0200] - import userRoot: Import complete. > Processed 4850 entries in 5 seconds. (970.00 entries/sec) > [15/Jun/2010:10:55:49 +0200] - attrcrypt_unwrap_key: failed to unwrap > key for cipher AES > [15/Jun/2010:10:55:49 +0200] - Failed to retrieve key for cipher AES > in attrcrypt_cipher_init > [15/Jun/2010:10:55:49 +0200] - Failed to initialize cipher AES in > attrcrypt_init > [15/Jun/2010:10:55:49 +0200] NSMMReplicationPlugin - > multimaster_be_state_change: replica dc=directory,dc=intern is coming > online; enabling replication > [15/Jun/2010:10:55:49 +0200] NSMMReplicationPlugin - > replica_reload_ruv: Warning: new data for replica > dc=directory,dc=intern does not match the data in the changelog. > Recreating the changelog file. This could affect replication with > replica's consumers in which case the consumers should be reinitialized. > [15/Jun/2010:10:55:49 +0200] - skipping cos definition > cn=nsAccountInactivation_cos,dc=directory,dc=intern--no templates found > [15/Jun/2010:10:59:57 +0200] - slapd shutting down - signaling > operation threads > [15/Jun/2010:10:59:57 +0200] - slapd shutting down - waiting for 26 > threads to terminate > [15/Jun/2010:10:59:57 +0200] - slapd shutting down - closing down > internal subsystems and plugins > [15/Jun/2010:10:59:58 +0200] - Waiting for 4 database threads to stop > [15/Jun/2010:10:59:59 +0200] - All database threads now stopped > [15/Jun/2010:10:59:59 +0200] - slapd stopped. > CentOS-Directory/8.1.0 B2009.134.1334 > <host>:<port> (/etc/dirsrv/slapd-serverb55) > > [15/Jun/2010:11:00:01 +0200] - Entry "cn=schema" single-valued > attribute "modifyTimestamp" has multiple values > CentOS-Directory/8.1.0 B2009.134.1334 > serverb55:636 (/etc/dirsrv/slapd-serverb55) > > [15/Jun/2010:11:00:01 +0200] - CentOS-Directory/8.1.0 B2009.134.1334 > starting up > [15/Jun/2010:11:00:01 +0200] - I'm resizing my cache now...cache was > 20000000 and is now 8000000 > [15/Jun/2010:11:00:01 +0200] - attrcrypt_unwrap_key: failed to unwrap > key for cipher AES > [15/Jun/2010:11:00:01 +0200] - Failed to retrieve key for cipher AES > in attrcrypt_cipher_init > [15/Jun/2010:11:00:01 +0200] - Failed to initialize cipher AES in > attrcrypt_init > [15/Jun/2010:11:00:01 +0200] - attrcrypt_unwrap_key: failed to unwrap > key for cipher AES > [15/Jun/2010:11:00:01 +0200] - Failed to retrieve key for cipher AES > in attrcrypt_cipher_init > [15/Jun/2010:11:00:01 +0200] - Failed to initialize cipher AES in > attrcrypt_init > [15/Jun/2010:11:00:01 +0200] - skipping cos definition > cn=nsAccountInactivation_cos,dc=directory,dc=intern--no templates found > [15/Jun/2010:11:00:01 +0200] NSMMReplicationPlugin - > replica_check_for_data_reload: Warning: data for replica > dc=directory,dc=intern was reloaded and it no longer matches the data > in the changelog (replica data > changelog). Recreating the changelog > file. This could affect replication with replica's consumers in which > case the consumers should be reinitialized. > [15/Jun/2010:11:00:01 +0200] - skipping cos definition > cn=nsAccountInactivation_cos,dc=directory,dc=intern--no templates found > [15/Jun/2010:11:00:01 +0200] - slapd started. Listening on All > Interfaces port 389 for LDAP requests > [15/Jun/2010:11:00:01 +0200] - Listening on All Interfaces port 636 > for LDAPS requests > [15/Jun/2010:11:29:59 +0200] - slapd shutting down - signaling > operation threads > [15/Jun/2010:11:29:59 +0200] - slapd shutting down - closing down > internal subsystems and plugins > [15/Jun/2010:11:30:00 +0200] - Waiting for 4 database threads to stop > [15/Jun/2010:11:30:00 +0200] - All database threads now stopped > [15/Jun/2010:11:30:00 +0200] - slapd stopped. > CentOS-Directory/8.1.0 B2009.134.1334 > <host>:<port> (/etc/dirsrv/slapd-serverb55) > > [15/Jun/2010:11:30:03 +0200] - Entry "cn=schema" single-valued > attribute "modifyTimestamp" has multiple values > CentOS-Directory/8.1.0 B2009.134.1334 > serverb55:636 (/etc/dirsrv/slapd-serverb55) > > [15/Jun/2010:11:30:03 +0200] - CentOS-Directory/8.1.0 B2009.134.1334 > starting up > [15/Jun/2010:11:30:03 +0200] - attrcrypt_unwrap_key: failed to unwrap > key for cipher AES > [15/Jun/2010:11:30:03 +0200] - Failed to retrieve key for cipher AES > in attrcrypt_cipher_init > [15/Jun/2010:11:30:03 +0200] - Failed to initialize cipher AES in > attrcrypt_init > [15/Jun/2010:11:30:03 +0200] - attrcrypt_unwrap_key: failed to unwrap > key for cipher AES > [15/Jun/2010:11:30:03 +0200] - Failed to retrieve key for cipher AES > in attrcrypt_cipher_init > [15/Jun/2010:11:30:03 +0200] - Failed to initialize cipher AES in > attrcrypt_init > [15/Jun/2010:11:30:03 +0200] - skipping cos definition > cn=nsAccountInactivation_cos,dc=directory,dc=intern--no templates found > [15/Jun/2010:11:30:03 +0200] - skipping cos definition > cn=nsAccountInactivation_cos,dc=directory,dc=intern--no templates found > [15/Jun/2010:11:30:03 +0200] - slapd started. Listening on All > Interfaces port 389 for LDAP requests > [15/Jun/2010:11:30:03 +0200] - Listening on All Interfaces port 636 > for LDAPS requests > [15/Jun/2010:11:40:44 +0200] - Beginning export of 'userroot' > [15/Jun/2010:11:40:44 +0200] - export userRoot: Processed 139 entries > (100%). > [15/Jun/2010:11:40:44 +0200] - Export finished. > [15/Jun/2010:11:46:12 +0200] NSMMReplicationPlugin - > multimaster_be_state_change: replica dc=directory,dc=intern is going > offline; disabling replication > [15/Jun/2010:11:46:12 +0200] - attrcrypt_unwrap_key: failed to unwrap > key for cipher AES > [15/Jun/2010:11:46:12 +0200] - Failed to retrieve key for cipher AES > in attrcrypt_cipher_init > [15/Jun/2010:11:46:12 +0200] - Failed to initialize cipher AES in > attrcrypt_init > [15/Jun/2010:11:46:12 +0200] - WARNING: Import is running with > nsslapd-db-private-import-mem on; No other process is allowed to > access the database > [15/Jun/2010:11:46:14 +0200] - libdb: page 1: illegal page type or format > [15/Jun/2010:11:46:14 +0200] - libdb: PANIC: Invalid argument > [15/Jun/2010:11:46:14 +0200] - FATAL ERROR at by MCC ou=people > dc=directory dc=intern (77); server stopping as database recovery needed. > CentOS-Directory/8.1.0 B2009.134.1334 > <host>:<port> (/etc/dirsrv/slapd-serverb55 > ====================================================================== > > Finally my questions : > What could be the cause of the problem ? Not sure - never seen this particular error before - any disk errors in /var/log/messages? Can you try doing a reinit of this server from the other server? > What would be the best procedure to get the serverb55 up and running > again ? Doing a reinit of this server from the other server. > > Thanks for any advise. > > Regards, > Mark > > > > ======= > > > On Tue, Jun 15, 2010 at 7:04 PM, Rich Megginson <rmeggins at redhat.com > <mailto:rmeggins at redhat.com>> wrote: > > mark benschop wrote: > > Hi All, > > > > I'm having a problem on a CentOs Directory Server 8.1 multiple > master > > setup. > > The database of one of the servers has been marked as corrupt > and has > > been brought offline by the Directory Server. > Can you post any relevant error messages from the error log of the > server? > > Ldapclients querying the ldapserver for e.g. loggin in of users > get an > > errormessage, effectively disabling users to log in. > What error message? > > > > I'm wondering what the best method is to recover from this > situation. > > I can think of a few : > > 1) Starting the ldapserver, deleting the database, recreating it and > > restoring a backup. > > > 2) Starting the ldapserver, deleting the database and reinitialising > > the server from the other master. > If you reinitialize the problem server from another server, you don't > need to delete the database, reinit will do that for you. > > > > Can anyone give me some hints if this wil work or would another > > approach be better ? > > > > Thanks for your advise, > > Mark > > > ------------------------------------------------------------------------ > > > > -- > > 389 users mailing list > > 389-users at lists.fedoraproject.org > <mailto:389-users at lists.fedoraproject.org> > > https://admin.fedoraproject.org/mailman/listinfo/389-users > > -- > 389 users mailing list > 389-users at lists.fedoraproject.org > <mailto:389-users at lists.fedoraproject.org> > https://admin.fedoraproject.org/mailman/listinfo/389-users > > > ------------------------------------------------------------------------ > > -- > 389 users mailing list > 389-users at lists.fedoraproject.org > https://admin.fedoraproject.org/mailman/listinfo/389-users