Robert Viduya wrote: > > On Nov 10, 2009, at 3:25 PM, Rich Megginson wrote: >>> >> No. Check the access log to see what operations were submitted to >> the directory server at or around [10/Nov/2009:08:56:27 -0500] >> >> Are your servers in time sync? Is cn=people rewbell gertrude the >> agreement that sends updates to the master that is having the >> spontaneous import problem? > > Yes, "cn=people rewbell gertrude" is that agreement. We name our > agreements with the name of the database ("people") followed by the > name of the source server ("rewbell") followed by the name of the > destination server ("gertrude"). Gertrude is the master that had the > spontaneous import problem. > > We have ntpd running on all our machines, so all times should be in sync. > > We have a lot going on, but I found the following in the access log on > the problem server: > > [10/Nov/2009:08:55:19 -0500] conn=158624 op=71 RESULT err=0 tag=103 > nentries=0 etime=0 csn=4af96f38000000370000 > [10/Nov/2009:08:55:19 -0500] conn=158624 op=72 MOD > dn="gtdirguid=54e468fd7520dbdb1da52263fb4a6576,ou=people,dc=gted,dc=gatech,dc=edu" > > [10/Nov/2009:08:55:19 -0500] conn=158624 op=72 RESULT err=0 tag=103 > nentries=0 etime=0 csn=4af96f38000100370000 > [10/Nov/2009:08:55:19 -0500] conn=158624 op=73 MOD > dn="gtdirguid=af56dc17bc397b8371b9eed7d91b2af2,ou=people,dc=gted,dc=gatech,dc=edu" > > [10/Nov/2009:08:55:19 -0500] conn=158624 op=73 RESULT err=0 tag=103 > nentries=0 etime=0 csn=4af96f38000500370000 > [10/Nov/2009:08:55:19 -0500] conn=158624 op=74 MOD > dn="gtdirguid=5261a6e61f97bca3f2741b428255daf4,ou=people,dc=gted,dc=gatech,dc=edu" > > [10/Nov/2009:08:55:19 -0500] conn=158624 op=74 RESULT err=0 tag=103 > nentries=0 etime=0 csn=4af96f39000000370000 > [10/Nov/2009:08:55:19 -0500] conn=158624 op=75 MOD > dn="gtdirguid=ad5dddfb9ec4fe230644dc060ce641c7,ou=people,dc=gted,dc=gatech,dc=edu" > > [10/Nov/2009:08:55:19 -0500] conn=158624 op=75 RESULT err=0 tag=103 > nentries=0 etime=0 csn=4af96f39000100370000 > [10/Nov/2009:08:55:19 -0500] conn=158624 op=76 MOD > dn="gtdirguid=0a87d4434e42edf9df3019b922fe2b63,ou=people,dc=gted,dc=gatech,dc=edu" > > [10/Nov/2009:08:55:19 -0500] conn=158624 op=76 RESULT err=0 tag=103 > nentries=0 etime=0 csn=4af96f39000200370000 > [10/Nov/2009:08:55:20 -0500] conn=158624 op=77 MOD > dn="gtdirguid=8be9c47e00957f900b69af162cba8adf,ou=people,dc=gted,dc=gatech,dc=edu" > > [10/Nov/2009:08:55:20 -0500] conn=158624 op=77 RESULT err=0 tag=103 > nentries=0 etime=0 csn=4af96f7b000000370000 > [10/Nov/2009:08:55:20 -0500] conn=158624 op=78 MOD > dn="gtdirguid=beef247b54a6e1289f3e392865e08e17,ou=people,dc=gted,dc=gatech,dc=edu" > > [10/Nov/2009:08:55:20 -0500] conn=158624 op=78 RESULT err=0 tag=103 > nentries=0 etime=0 csn=4af96f7b000400370000 > [10/Nov/2009:08:55:20 -0500] conn=158624 op=79 MOD > dn="gtdirguid=db2aa7040d3b982cf1a32aad503d7e6f,ou=people,dc=gted,dc=gatech,dc=edu" > > [10/Nov/2009:08:55:20 -0500] conn=158624 op=79 RESULT err=0 tag=103 > nentries=0 etime=0 csn=4af96f7b000500370000 > [10/Nov/2009:08:55:20 -0500] conn=158624 op=80 MOD > dn="gtdirguid=c9390deb6018475b4d5d5267ba70fb99,ou=people,dc=gted,dc=gatech,dc=edu" > > [10/Nov/2009:08:55:20 -0500] conn=158624 op=80 RESULT err=0 tag=103 > nentries=0 etime=0 csn=4af96f7b000600370000 > [10/Nov/2009:08:55:20 -0500] conn=158624 op=81 EXT > oid="2.16.840.1.113730.3.5.5" name="Netscape Replication End Session" > [10/Nov/2009:08:55:20 -0500] conn=158624 op=81 RESULT err=0 tag=120 > nentries=0 etime=0 > [10/Nov/2009:08:55:24 -0500] conn=158624 op=83 EXT > oid="2.16.840.1.113730.3.5.3" name="Netscape Replication Start Session" > [10/Nov/2009:08:55:24 -0500] conn=158624 op=83 RESULT err=0 tag=120 > nentries=0 etime=0 > [10/Nov/2009:08:55:28 -0500] conn=158624 op=84 EXT > oid="2.16.840.1.113730.3.5.3" name="Netscape Replication Start Session" > [10/Nov/2009:08:55:28 -0500] conn=158624 op=84 RESULT err=0 tag=120 > nentries=0 etime=0 > [10/Nov/2009:08:56:06 -0500] conn=158624 op=86 EXT > oid="2.16.840.1.113730.3.5.3" name="Netscape Replication Start Session" > [10/Nov/2009:08:56:06 -0500] conn=158624 op=87 EXT > oid="2.16.840.1.113730.3.5.3" name="Netscape Replication Start Session" > [10/Nov/2009:08:56:07 -0500] conn=158624 op=87 RESULT err=0 tag=120 > nentries=0 etime=1 > [10/Nov/2009:08:56:10 -0500] conn=158624 op=88 EXT > oid="2.16.840.1.113730.3.5.3" name="Netscape Replication Start Session" > [10/Nov/2009:08:56:27 -0500] conn=158624 op=88 RESULT err=0 tag=120 > nentries=0 etime=17 > [10/Nov/2009:08:56:59 -0500] conn=158624 op=89 EXT > oid="2.16.840.1.113730.3.5.3" name="Netscape Replication Start Session" > [10/Nov/2009:08:56:59 -0500] conn=158624 op=89 RESULT err=0 tag=120 > nentries=0 etime=0 > [10/Nov/2009:08:57:59 -0500] conn=158624 op=91 UNBIND > [10/Nov/2009:08:57:59 -0500] conn=158624 op=91 fd=64 closed - U1 > > You can see that replication was working until around 8:55:20 and then > there's a series of "Replication Start Session" operations with no > corresponding "End Session". The log messages from the supplier > server (which I sent in the previous mail) seem to indicate that some > communication problems may have occurred (all those "Unable to receive > response" logs). > > Digging through the audit-log, I found the following: > > time: 20091110085608 > dn: cn="ou=people,dc=gted,dc=gatech,dc=edu",cn=mapping tree,cn=config > changetype: modify > replace: nsslapd-referral > nsslapd-referral: > ldap://rewbell.iam.gatech.edu:389/ou%3Dpeople%2Cdc%3Dgted%2C > dc%3Dgatech%2Cdc%3Dedu > - > replace: modifiersname > modifiersname: cn=server,cn=plugins,cn=config > - > replace: modifytimestamp > modifytimestamp: 20091110135607Z > - > > time: 20091110085610 > dn: cn="ou=people,dc=gted,dc=gatech,dc=edu",cn=mapping tree,cn=config > changetype: modify > replace: nsslapd-referral > nsslapd-referral: > ldaps://rw.gted.gatech.edu:636/ou=people,dc=gted,dc=gatech,d > c=edu > - > replace: modifiersname > modifiersname: cn=server,cn=plugins,cn=config > - > replace: modifytimestamp > modifytimestamp: 20091110135610Z > - > > time: 20091110085625 > dn: cn="ou=people,dc=gted,dc=gatech,dc=edu",cn=mapping tree,cn=config > changetype: modify > replace: nsslapd-state > nsslapd-state: referral > - > replace: modifiersname > modifiersname: cn=server,cn=plugins,cn=config > - > replace: modifytimestamp > modifytimestamp: 20091110135625Z > - > > time: 20091110085626 > dn: cn="ou=people,dc=gted,dc=gatech,dc=edu",cn=mapping tree,cn=config > changetype: modify > replace: nsslapd-state > nsslapd-state: backend > - > replace: modifiersname > modifiersname: cn=server,cn=plugins,cn=config > - > replace: modifytimestamp > modifytimestamp: 20091110135626Z > - > > The referral for that suffix first gets changed > "ldap://rewbell.iam.gatech.edu:389/...", which isn't right. > > Then it get changed to "ldaps://rw.gted.gatech.edu:636/..." which is > correct (and what we have configured in the nsDS5ReplicaReferral > attribute). > > Then the state gets changed a couple of times too. Not sure if this > is a cause or effect of the problem we're seeing, but I'm inclined to > think it's an effect of the server thinking it's been told to do an > import, not a cause. IMHO, of course. Yes, I think you are correct. The behavior in the logs on the consumer is consistent with that. When a supplier starts an initialization of a consumer, it sets the consumer's database to referral mode - that is, the consumer should not attempt to process any requests for that database during initialization - it's database is offline while being initialized. Another odd thing is that op=85 is not logged - not sure what happened there. It would be useful to take a look at the supplier access and error logs from around 08:55:28 to 08:56:27 > > -- > 389 users mailing list > 389-users at redhat.com > https://www.redhat.com/mailman/listinfo/fedora-directory-users -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/x-pkcs7-signature Size: 3258 bytes Desc: S/MIME Cryptographic Signature Url : http://lists.fedoraproject.org/pipermail/389-users/attachments/20091112/6ebd9f19/attachment.bin