On Wed, Feb 15, 2012 at 18:19:10 -0600, Rich Megginson wrote:
> On 02/15/2012 03:51 PM, Iain Morgan wrote:
> > On Wed, Feb 15, 2012 at 15:04:52 -0600, Rich Megginson wrote:
> >> On 02/15/2012 01:56 PM, Iain Morgan wrote:
> >>> On Tue, Feb 14, 2012 at 19:54:39 -0600, Rich Megginson wrote:
> >>>> On 02/14/2012 06:37 PM, Iain Morgan wrote:
> >>>>> Hello,
> >>>>>
> >>>>> On a fairly frequent basis, one of my 389 DS servers hangs after certain
> >>>>> CMP operations. Once this happens, the server cannot be shut down
> >>>>> gracefully. This has been going on for several weeks, and I have not yet
> >>>>> found a solution.
> >>>>>
> >>>>> My setup consists of two systems running RHEL 6.2 with 389 DS 1.2.9.16.
> >>>>> Multimaster replication is enabled between the two servers, but the
> >>>>> client systems (currently just two test systems) preferentially use the
> >>>>> same server, ServerA. The second server, ServerB, is the one which is
> >>>>> experiencing the problem.
> >>>>>
> >>>>> We are using class-of-service entries to set the values for the
> >>>>> shadowMax, shadowMin, and shadowWarning attributes. And we are
> >>>>> conditionally setting a pwdPolicySubentry attribute for some entries in
> >>>>> the same manner.
> >>>>>
> >>>>> If I execute an ldapcompare command, such as the following:
> >>>>>
> >>>>> # ldapcompare uid=imorgan,ou=People,dc=example,dc=com \
> >>>>> pwdpolicysubentry:"cn=Special Policy,ou=Policies,dc=example,dc=com"
> >>>>>
> >>>>> the command will occasionally hang. Most of the time, the command
> >>>>> succeeds and indicates that the attribute is not defined for that entry.
> >>>>> However, once or twice a day it will simply hang.
> >>>>>
> >>>>> The access log shows that the CMP request was received, but no result is
> >>>>> logged. After this occurs, the server will not shut down gracefully. The
> >>>>> init script fails to shut down the server and I end up having to send a
> >>>>> SIGKILL to ns-slapd.
> >>>> When you get the hang, can you attach to the process with gdb?
> >>>> ps -ef|grep ns-slapd
> >>>> gdb /usr/sbin/ns-slapd pid-of-ns-slapd
> >>>>> The error log does not report any issues.
> >>>>>
> >>>>> CMP operations against other attributes, such as loginShell, do not seem
> >>>>> to exhibit this problem. Also, the problem does not occur on ServerA;
> >>>>> only on ServerB. Once the CMP operation has hung, comparisons against
> >>>>> other attributes, even shadowMax, continue to work.
> >>>>>
> >>>>> As noted above, most of the time the CMP operation returns normally.
> >>>>> However, if I reinitialize ServerB from ServerA, the problem occurs with
> >>>>> the first CMP operation against ServerB.
> >>>>>
> >>>>> Both servers have the same set of RPMs and the dse.ldif on both systems
> >>>>> do not have any significant differences.
> >>>>>
> >>>>> Has anyone seen a similar issue? Any suggestions on how to debug or fix
> >>>>> this?
> >>>>>
> >>>>> A somewhat simplified and redacted version of the class-of-service
> >>>>> configuration is listed below.
> >>>>>
> >>>>> Thanks
> >>> A gzip'd copy of the 'thread apply all bt full' output is attached.
> >>>
> >> Thanks. Can you do this again after installing the
> >> 389-ds-base-debuginfo package?
> >> debuginfo-install 389-ds-base
> > Ah, sorry about that. Here's the updated output.
> >
> >> Are you using Views?
> >> http://docs.redhat.com/docs/en-US/Red_Hat_Directory_Server/9.0/html/Administration_Guide/using-views.html
> > No.
> >
> Thanks! This looks like a symptom of
> https://fedorahosted.org/389/ticket/247 fixed in 1.2.10

Hello Rich,

Thanks, I upgraded both of the servers to 1.2.10.1. Unfortunately, it did
not resolve the issue.

I also noticed that if I run the same ldapcompare command after the first
try fails, the server crashes. I can't say whether that is a change in
behaviour, but it is a new observation.

I've attached gdb output for the case where the first ldapcompare is
hanging. I've also attached the gdb analysis of the core dump.

--
Iain Morgan
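
For anyone reproducing this, the gdb steps quoted above (attach to ns-slapd, then 'thread apply all bt full') can also be run without an interactive session. What follows is only a minimal sketch, not the procedure used in this thread: the function name capture_backtrace, the DRY_RUN switch, and the output path are hypothetical and chosen for illustration. It relies on gdb's standard -batch and -ex options and on the same "gdb program pid" form of attaching that is shown in the quoted text.

```shell
# Sketch: dump full backtraces of every thread in a running ns-slapd.
# capture_backtrace, DRY_RUN and the output path are illustrative names only.
capture_backtrace() {
    pid=$1   # PID of the hung ns-slapd process
    out=$2   # file that receives the gdb output

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of attaching to anything.
        printf '%s\n' "gdb -batch -ex 'thread apply all bt full' /usr/sbin/ns-slapd $pid > $out"
        return 0
    fi

    # -batch exits after the commands given with -ex have run.
    gdb -batch -ex 'thread apply all bt full' /usr/sbin/ns-slapd "$pid" > "$out" 2>&1
}

# Example, run as root while the CMP operation is hung:
#   capture_backtrace "$(pidof ns-slapd)" /tmp/389-hang.txt
```

As noted earlier in the thread, the backtrace is only readable once the 389-ds-base-debuginfo package is installed. If more than one ns-slapd instance is running, pidof returns several PIDs and the right one has to be picked by hand.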
Attachment:
389-hang.txt.gz
Description: application/gunzip
Attachment:
389-coredump.txt.gz
Description: application/gunzip
-- 389 users mailing list 389-users@xxxxxxxxxxxxxxxxxxxxxxx https://admin.fedoraproject.org/mailman/listinfo/389-users