I am looking for help isolating a recurring problem with
our 389 DS version 1.2.8.3 installation. We use the
directory server for user authentication on our website.
Every couple of days, our website login application starts
reporting that user authentication timed out, and these
timeouts become more frequent as time passes. Our current
workaround is to restart the directory server, which fixes
the problem for a couple of days before the timeouts start
again. I traced one application timeout back to the DS
access logs and found the following entry at the same time:
[14/Mar/2012:10:23:01 -0500] conn=14730
op=-1 fd=1093 closed error 104 (Connection reset by peer) -
TCP connection reset by peer.
I looked through the older logs, and the only time this
conn/fd was used was two days earlier. Here are the access
log entries:
[12/Mar/2012:14:33:06 -0500] conn=14730
fd=1093 slot=1093 connection from 10.1.xx.xx to 10.1.xx.xx
[12/Mar/2012:14:33:06 -0500] conn=14730
op=0 BIND dn="uid,dc=domain,dc=com" method=128 version=3
[12/Mar/2012:14:33:06 -0500] conn=14730
op=0 RESULT err=0 tag=97 nentries=0 etime=0
dn="uid=theManager,dc=domain,dc=com"
[12/Mar/2012:14:33:06 -0500] conn=14730
op=1 SRCH base="ou=users,ou=external,dc=domain,dc=com" scope=2
filter="(&(uid=xxxxx)(objectClass=inetUser))" attrs="1.1"
[12/Mar/2012:14:33:06 -0500] conn=14730
op=1 RESULT err=0 tag=101 nentries=1 etime=0
[12/Mar/2012:14:33:06 -0500] conn=14730
op=2 BIND dn="uid=xxxxx,ou=users,ou=external,dc=domain,dc=com"
method=128 version=3
[12/Mar/2012:14:33:06 -0500] conn=14730
op=2 RESULT err=0 tag=97 nentries=0 etime=0
dn="uid=xxxxx,ou=users,ou=external,dc=domain,dc=com"
[12/Mar/2012:14:33:06 -0500] conn=14730
op=3 BIND dn="uid,dc=domain,dc=com" method=128 version=3
[12/Mar/2012:14:33:06 -0500] conn=14730
op=3 RESULT err=0 tag=97 nentries=0 etime=0
dn="uid=theManager,dc=domain,dc=com"
[12/Mar/2012:14:35:20 -0500] conn=14730
op=4 SRCH base="ou=groups,ou=external,dc=domain,dc=com"
scope=2
filter="(&(cn=domain)(|(objectClass=groupOfURLs)(objectClass=groupOfNames)))"
attrs="1.1"
[12/Mar/2012:14:35:20 -0500] conn=14730
op=4 RESULT err=0 tag=101 nentries=1 etime=0
[12/Mar/2012:14:35:20 -0500] conn=14730
op=5 SRCH base="ou=groups,ou=external,dc=domain,dc=com"
scope=2
filter="(&(member=cn=domain,ou=groups,ou=external,dc=domain,dc=com)(objectClass=groupOfNames))"
attrs="cn"
[12/Mar/2012:14:35:20 -0500] conn=14730
op=5 RESULT err=0 tag=101 nentries=0 etime=0
[12/Mar/2012:14:36:50 -0500] conn=14730
op=6 SRCH base="ou=users,ou=external,dc=domain,dc=com" scope=2
filter="(&(uid=xxxxxx)(objectClass=inetUser))" attrs="1.1"
[12/Mar/2012:14:36:50 -0500] conn=14730
op=6 RESULT err=0 tag=101 nentries=1 etime=0
[12/Mar/2012:14:36:50 -0500] conn=14730
op=7 BIND
dn="uid=xxxxxx,ou=users,ou=external,dc=domain,dc=com"
method=128 version=3
[12/Mar/2012:14:36:50 -0500] conn=14730
op=7 RESULT err=0 tag=97 nentries=0 etime=0
dn="uid=xxxxxx,ou=users,ou=external,dc=domain,dc=com"
[12/Mar/2012:14:36:50 -0500] conn=14730
op=8 BIND dn="uid=theManager,dc=domain,dc=com" method=128
version=3
[12/Mar/2012:14:36:50 -0500] conn=14730
op=8 RESULT err=0 tag=97 nentries=0 etime=0
dn="uid=theManager,dc=domain,dc=com"
[12/Mar/2012:14:37:02 -0500] conn=14730
op=9 SRCH base="ou=groups,ou=external,dc=domain,dc=com"
scope=2
filter="(&(cn=domain)(|(objectClass=groupOfURLs)(objectClass=groupOfNames)))"
attrs="1.1"
[12/Mar/2012:14:37:02 -0500] conn=14730
op=9 RESULT err=0 tag=101 nentries=1 etime=0
[12/Mar/2012:14:37:02 -0500] conn=14730
op=10 SRCH base="ou=groups,ou=external,dc=domain,dc=com"
scope=2
filter="(&(member=cn=domain,ou=groups,ou=external,dc=domain,dc=com)(objectClass=groupOfNames))"
attrs="cn"
[12/Mar/2012:14:37:02 -0500] conn=14730
op=10 RESULT err=0 tag=101 nentries=0 etime=0
[12/Mar/2012:14:39:35 -0500] conn=14730
op=11 SRCH base="ou=users,ou=external,dc=domain,dc=com"
scope=2 filter="(&(uid=xxxxxx)(objectClass=inetUser))"
attrs="1.1"
[12/Mar/2012:14:39:35 -0500] conn=14730
op=11 RESULT err=0 tag=101 nentries=1 etime=0
[12/Mar/2012:14:39:35 -0500] conn=14730
op=12 BIND
dn="uid=xxxxxx,ou=users,ou=external,dc=domain,dc=com"
method=128 version=3
[12/Mar/2012:14:39:35 -0500] conn=14730
op=12 RESULT err=0 tag=97 nentries=0 etime=0
dn="uid=xxxxxx,ou=users,ou=external,dc=domain,dc=com"
[12/Mar/2012:14:39:35 -0500] conn=14730
op=13 BIND dn="uid=theManager,dc=domain,dc=com" method=128
version=3
[12/Mar/2012:14:39:35 -0500] conn=14730
op=13 RESULT err=0 tag=97 nentries=0 etime=0
dn="uid=theManager,dc=domain,dc=com"
[12/Mar/2012:14:40:23 -0500] conn=14730
op=14 SRCH base="ou=groups,ou=external,dc=domain,dc=com"
scope=2
filter="(&(cn=domain)(|(objectClass=groupOfURLs)(objectClass=groupOfNames)))"
attrs="1.1"
[12/Mar/2012:14:40:23 -0500] conn=14730
op=14 RESULT err=0 tag=101 nentries=1 etime=0
[12/Mar/2012:14:40:23 -0500] conn=14730
op=15 SRCH base="ou=groups,ou=external,dc=domain,dc=com"
scope=2
filter="(&(member=cn=domain,ou=groups,ou=external,dc=domain,dc=com)(objectClass=groupOfNames))"
attrs="cn"
[12/Mar/2012:14:40:23 -0500] conn=14730
op=15 RESULT err=0 tag=101 nentries=0 etime=0
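To put a number on how long a connection sat unused before the
reset, something like the following could be run against the
access log. It is only a minimal sketch: the names
`idle_before_close`, `ENTRY` and `TS_FORMAT` are made up for
illustration, and it assumes the real log file has one entry
per line (the entries above are wrapped by the mail client).

```python
import re
from datetime import datetime

# Timestamp, connection number, and the remainder of one access log entry.
# Assumption: entries follow the layout shown in the excerpts above.
ENTRY = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+conn=(?P<conn>\d+)\s+(?P<rest>.*)$")
TS_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def idle_before_close(lines):
    """Return {conn: seconds between its last logged entry and its close entry}."""
    last_seen = {}
    idle = {}
    for line in lines:
        m = ENTRY.match(line.strip())
        if not m:
            continue  # skip anything that is not a log entry
        ts = datetime.strptime(m.group("ts"), TS_FORMAT)
        conn = int(m.group("conn"))
        if re.search(r"\bfd=\d+ closed\b", m.group("rest")):
            # Close entry: measure the gap since the previous entry, if any.
            if conn in last_seen:
                idle[conn] = (ts - last_seen[conn]).total_seconds()
        else:
            last_seen[conn] = ts
    return idle


if __name__ == "__main__":
    sample = [
        "[12/Mar/2012:14:40:23 -0500] conn=14730 op=15 RESULT err=0 "
        "tag=101 nentries=0 etime=0",
        "[14/Mar/2012:10:23:01 -0500] conn=14730 op=-1 fd=1093 closed "
        "error 104 (Connection reset by peer) - TCP connection reset by peer.",
    ]
    for conn, seconds in idle_before_close(sample).items():
        print(f"conn={conn} idle for {seconds / 3600:.1f} hours before close")
```

For conn=14730, the gap between the last RESULT and the reset
works out to roughly 43.7 hours.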
The scenario seems to be that the DS works fine after a
restart until it runs out of unused connections and/or file
descriptors (max FDs = 8192). Once it starts recycling
connections and/or file descriptors, the 104 errors appear
more often in the access logs and we get more authentication
errors. We suspect that the original connection was never
terminated correctly, but we don’t know whether the
application or a DS setting is at fault.
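
One way to check the theory about lingering connections would
be to count how many connections have an open entry in the
access log but no matching close entry, and compare that count
with the 8192 FD limit. The following is a rough sketch only:
`open_connections`, `OPENED` and `CLOSED` are illustrative
names, the patterns assume the entry layout seen in the
excerpts above, and the second connection in the sample
(conn=14731, with placeholder IP addresses) is invented for
the example.

```python
import re

# Assumption: open and close entries look like the ones quoted in this post.
OPENED = re.compile(r"conn=(\d+) fd=(\d+) slot=\d+ connection from")
CLOSED = re.compile(r"conn=(\d+) op=-?\d+ fd=(\d+) closed")


def open_connections(lines):
    """Return {conn: fd} for connections that were opened but never closed."""
    still_open = {}
    for line in lines:
        m = OPENED.search(line)
        if m:
            still_open[int(m.group(1))] = int(m.group(2))
            continue
        m = CLOSED.search(line)
        if m:
            # Drop the connection once its close entry is seen.
            still_open.pop(int(m.group(1)), None)
    return still_open


if __name__ == "__main__":
    sample = [
        "[12/Mar/2012:14:33:06 -0500] conn=14730 fd=1093 slot=1093 "
        "connection from 10.1.0.1 to 10.1.0.2",
        "[12/Mar/2012:14:33:07 -0500] conn=14731 fd=1094 slot=1094 "
        "connection from 10.1.0.1 to 10.1.0.2",
        "[14/Mar/2012:10:23:01 -0500] conn=14730 op=-1 fd=1093 closed "
        "error 104 (Connection reset by peer) - TCP connection reset by peer.",
    ]
    lingering = open_connections(sample)
    print(f"{len(lingering)} connection(s) still open: {lingering}")
```

If the number of lingering connections climbs steadily toward
the FD limit between restarts, that would point at connections
not being closed by the client rather than at normal reuse.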