On 11/06/2014 03:14 AM, Rich Megginson wrote:
> On 11/06/2014 04:16 AM, Orion Poplawski wrote:
>> Just recently we're seeing some very strange behavior on our system.
>> Periodically we will see a sssd process start to have an ever greater number
>> of connections to our ldap server until the server runs out of file
>> descriptors.  This seems to be happening with a particular user, who is
>> having trouble logging in at times, particularly with email (dovecot).  We
>> see entries like the following on our server:
>>
>> [05/Nov/2014:17:14:51 -0700] conn=1786153 op=0 EXT
>> oid="1.3.6.1.4.1.1466.20037" name="startTLS"
>> [05/Nov/2014:17:14:51 -0700] conn=1786153 op=0 RESULT err=0 tag=120
>> nentries=0 etime=0
>> [05/Nov/2014:17:14:51 -0700] conn=1786153 SSL 128-bit AES
>> [05/Nov/2014:17:14:51 -0700] conn=1786153 op=1 BIND
>> dn="uid=user,ou=People,dc=domain,dc=com" method=128 version=3
>> [05/Nov/2014:17:14:56 -0700] conn=1786153 op=2 ABANDON targetop=NOTFOUND
>> msgid=2
>> [05/Nov/2014:17:14:56 -0700] conn=1786153 op=3 UNBIND
>> [05/Nov/2014:17:14:56 -0700] conn=1786153 op=3 fd=1022 closed - U1
>>
>> I don't yet have debug info from the sssd process.  Any ideas from the above?
>>
>> Restarting the sssd process seems to clear things up for a while.
>>
>> - Orion
>>
> Try to reproduce the problem while using gdb to capture stack traces every few
> seconds as in http://www.port389.org/docs/389ds/FAQ/faq.html#debugging-hangs
> Ideally, we can get some stack traces of the server during the time between
> the BIND and the ABANDON

If I catch the problem early enough I can still get a stack trace.  A series of
them are in http://www.cora.nwra.com/~orion/ns-slapd-trace.tar.gz.  Anything
useful there?
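
To help line up stack traces with the window between the BIND and the ABANDON, here is a minimal sketch of scanning an access log for that pattern. It assumes each log entry sits on a single line in the format quoted above, and that the English month abbreviations parse under the current locale. The function name bind_abandon_gaps and the SAMPLE data are illustrative only and not part of 389 Directory Server or sssd.

```python
import re
from datetime import datetime

# Matches access-log lines that carry an op= field, e.g.
# [05/Nov/2014:17:14:51 -0700] conn=1786153 op=1 BIND dn="..." method=128 version=3
LINE_RE = re.compile(
    r'^\[(?P<ts>[^\]]+)\] conn=(?P<conn>\d+) op=(?P<op>-?\d+) (?P<rest>.*)$'
)
TS_FORMAT = "%d/%b/%Y:%H:%M:%S %z"  # assumes an English/C locale for %b


def bind_abandon_gaps(lines):
    """Return (conn, dn, seconds) for each connection whose BIND is followed by an ABANDON."""
    pending = {}  # conn -> (time of BIND, bind DN)
    gaps = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # lines without op= (e.g. the "SSL 128-bit AES" line) are skipped
        ts = datetime.strptime(m.group("ts"), TS_FORMAT)
        conn, rest = m.group("conn"), m.group("rest")
        if rest.startswith("BIND "):
            dn = re.search(r'dn="([^"]*)"', rest)
            pending[conn] = (ts, dn.group(1) if dn else "")
        elif rest.startswith("ABANDON") and conn in pending:
            start, dn = pending.pop(conn)
            gaps.append((conn, dn, (ts - start).total_seconds()))
    return gaps


# Hypothetical sample: the quoted entries, rejoined onto one line each.
SAMPLE = [
    '[05/Nov/2014:17:14:51 -0700] conn=1786153 op=0 EXT oid="1.3.6.1.4.1.1466.20037" name="startTLS"',
    '[05/Nov/2014:17:14:51 -0700] conn=1786153 op=0 RESULT err=0 tag=120 nentries=0 etime=0',
    '[05/Nov/2014:17:14:51 -0700] conn=1786153 SSL 128-bit AES',
    '[05/Nov/2014:17:14:51 -0700] conn=1786153 op=1 BIND dn="uid=user,ou=People,dc=domain,dc=com" method=128 version=3',
    '[05/Nov/2014:17:14:56 -0700] conn=1786153 op=2 ABANDON targetop=NOTFOUND msgid=2',
    '[05/Nov/2014:17:14:56 -0700] conn=1786153 op=3 UNBIND',
    '[05/Nov/2014:17:14:56 -0700] conn=1786153 op=3 fd=1022 closed - U1',
]

if __name__ == "__main__":
    for conn, dn, seconds in bind_abandon_gaps(SAMPLE):
        print(f"conn={conn} dn={dn} BIND->ABANDON after {seconds:.0f}s")
```

Run against the real access log (for example by passing an open file object instead of SAMPLE), this would list which connections and bind DNs show the stalled BIND, and the timestamps could then be matched to the gdb traces.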
-- 
Orion Poplawski
Technical Manager                     303-415-9701 x222
NWRA, Boulder/CoRA Office             FAX: 303-415-9702
3380 Mitchell Lane                       orion@xxxxxxxx
Boulder, CO 80301                   http://www.nwra.com
--
389 users mailing list
389-users@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/389-users