On Tue, Feb 14, 2012 at 19:54:39 -0600, Rich Megginson wrote:
> On 02/14/2012 06:37 PM, Iain Morgan wrote:
> > Hello,
> >
> > On a fairly frequent basis, one of my 389 DS servers hangs after certain
> > CMP operations. Once this happens, the server cannot be shut down
> > gracefully. This has been going on for several weeks, and I have not yet
> > found a solution.
> >
> > My setup consists of two systems running RHEL 6.2 with 389 DS 1.2.9.16.
> > Multimaster replication is enabled between the two servers, but the
> > client systems (currently just two test systems) preferentially use the
> > same server, ServerA. The second server, ServerB, is the one which is
> > experiencing the problem.
> >
> > We are using class-of-service entries to set the values for the
> > shadowMax, shadowMin, and shadowWarning attributes. And we are
> > conditionally setting a pwdPolicySubentry attribute for some entries in
> > the same manner.
> >
> > If I execute an ldapcompare command, such as the following:
> >
> > # ldapcompare uid=imorgan,ou=People,dc=example,dc=com \
> >     pwdpolicysubentry:"cn=Special Policy,ou=Policies,dc=example,dc=com"
> >
> > the command will occasionally hang. Most of the time, the command
> > succeeds and indicates that the attribute is not defined for that entry.
> > However, once or twice a day it will simply hang.
> >
> > The access log shows that the CMP request was received, but no result is
> > logged. After this occurs, the server will not shut down gracefully. The
> > init script fails to shut down the server and I end up having to send a
> > SIGKILL to ns-slapd.
> When you get the hang, can you attach to the process with gdb?
> ps -ef|grep ns-slapd
> gdb /usr/sbin/ns-slapd pid-of-ns-slapd
> > The error log does not report any issues.
> >
> > CMP operations against other attributes, such as loginShell, do not seem
> > to exhibit this problem. Also, the problem does not occur on ServerA;
> > only on ServerB. Once the CMP operation has hung, comparisons against
> > other attributes, even shadowMax, continue to work.
> >
> > As noted above, most of the time the CMP operation returns normally.
> > However, if I reinitialize ServerB from ServerA, the problem occurs with
> > the first CMP operation against ServerB.
> >
> > Both servers have the same set of RPMs and the dse.ldif files on both
> > systems do not have any significant differences.
> >
> > Has anyone seen a similar issue? Any suggestions on how to debug or fix
> > this?
> >
> > A somewhat simplified and redacted version of the class-of-service
> > configuration is listed below.
> >
> > Thanks

A gzip'd copy of the 'thread apply all bt full' output is attached.

--
Iain Morgan
Attachment:
389-debug.out.gz
Description: application/gunzip
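
For anyone reproducing this, the gdb step discussed above can also be run non-interactively, which is handy when the hang only shows up once or twice a day. The following is only a minimal sketch, not something posted in the thread: the function name dump_stacks, the GDB override variable, and the output file name are all hypothetical, and it assumes gdb's standard -batch and -ex options along with the /usr/sbin/ns-slapd path mentioned earlier.

```shell
#!/bin/sh
# Sketch: write "thread apply all bt full" for a running process to a file.
# Assumptions (not from the thread): the dump_stacks name, the GDB override
# variable, and the output path are hypothetical.

dump_stacks() {
    pid="$1"    # PID of the hung ns-slapd, e.g. taken from "ps -ef | grep ns-slapd"
    out="$2"    # file that receives the backtraces

    # -batch makes gdb exit after running the -ex command, so the process
    # is detached again once the backtraces have been written.
    "${GDB:-gdb}" -batch -ex "thread apply all bt full" \
        /usr/sbin/ns-slapd "$pid" > "$out" 2>&1
}

# Example (run as root or as the user that owns ns-slapd):
#   dump_stacks 12345 389-debug.out && gzip 389-debug.out
```

Installing the matching debuginfo packages beforehand should make the resulting backtraces considerably more readable.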