Re: FDS 1.1 Transport endpoint is not connected

Richard Hesse wrote:
Not much new to report. The server hung again and the only thing in the
error log with connection tracing is this:

[18/Feb/2008:13:14:03 +0000] - PR_Write(41818752) Netscape Portable Runtime error -5961 (TCP connection reset by peer.)
[18/Feb/2008:13:14:03 +0000] - ber_flush failed, error 104 (Connection reset by peer)

Which doesn't look like much.
Well, it tells me that the server was attempting to write to a socket and got an error. -5961 is PR_CONNECT_RESET_ERROR, which can occur if the system call returns either EPIPE or ECONNRESET. And error 104 is indeed ECONNRESET:

/usr/include/asm-generic/errno.h:#define ECONNRESET 104 /* Connection reset by peer */
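If you'd rather not grep the headers, the same lookup can be done from Python's standard library. This is only a small sketch; the helper name describe_errno is made up for illustration, and errno numbering is platform specific (104 is the Linux value quoted above).

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME (message)' for a numeric errno on the current platform."""
    # errno.errorcode maps numeric values to their symbolic names.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({os.strerror(code)})"


if __name__ == "__main__":
    # 104 is the value from the error log; other platforms may number it differently.
    print(describe_errno(104))
```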

AFAICT, this can happen if the client shuts down the socket (for any number of reasons) while the server is still attempting to send data. In that case the client's TCP stack responds with a RST, and the server's write fails with ECONNRESET. I'm not sure how or why the client would be doing this, and I'm open to other causes for ECONNRESET. What would be really, really interesting is if we could narrow this down to a particular client application and capture the connection with ethereal.
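To make the scenario concrete, here is a rough loopback reproduction in Python. It is not the directory server's code path, just an illustration of the socket behaviour: the "client" forces a RST on close (via SO_LINGER with a zero timeout, which is my assumption about one way a client could do this), and the "server" side then tries to write. The function name provoke_reset is hypothetical, and the struct linger layout assumed here is the Linux one (two ints).

```python
import errno
import socket
import struct
import time


def provoke_reset() -> int:
    """Return the errno seen when writing to a connection the peer has reset."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # let the kernel pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()

    # l_onoff=1, l_linger=0: close() aborts the connection with RST, not FIN.
    client.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    client.close()
    time.sleep(0.2)  # give the RST time to arrive

    try:
        for _ in range(10):
            server_side.send(b"x" * 1024)
            time.sleep(0.05)
    except OSError as exc:
        # Expected to be ECONNRESET or EPIPE, the two cases mentioned above.
        return exc.errno
    finally:
        server_side.close()
        listener.close()
    return 0  # no error observed


if __name__ == "__main__":
    code = provoke_reset()
    print(code, errno.errorcode.get(code, "no error"))
```

A packet capture of the run should show the RST coming from the client side just before the failing write.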

Are you using SSL?
As for network tuning, it's already been done.
Max descriptors is set to 32768.

Are there any gdb commands I can run while the server is in a hung state?
Sure. Whatever the cause of the ECONNRESET, it should not cause the server to hang, and it would be interesting to find out what the server is doing at that point. You'll have to install the fedora-ds-base-debuginfo package to get usable stack traces.
Attach to the process - gdb /usr/sbin/ns-slapd <pid of process>
Then, dump the thread stacks -

(gdb) thread apply all bt

If you want the output to go to a file, set the log file name and then turn logging on before doing the thread apply (if logging is turned on first, the output goes to the default gdb.txt), e.g.

(gdb) set logging file stack.txt
(gdb) set logging on
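Since the hang may occur when nobody is at the console, the same stack dump can also be taken non-interactively. Below is a minimal Python sketch that runs gdb in batch mode and writes the output to a file. The helper names and the default stack.txt path are illustrative only; it assumes gdb is on the PATH and that you have permission to attach to the process.

```python
import subprocess
from typing import List


def gdb_backtrace_argv(pid: int) -> List[str]:
    """Build a gdb command line that attaches to pid, dumps all thread stacks, and exits."""
    return ["gdb", "-batch", "-ex", "thread apply all bt", "-p", str(pid)]


def dump_stacks(pid: int, out_path: str = "stack.txt") -> int:
    """Run gdb against pid, writing its output to out_path. Returns gdb's exit code."""
    with open(out_path, "w") as out:
        result = subprocess.run(
            gdb_backtrace_argv(pid),
            stdout=out,
            stderr=subprocess.STDOUT,
            check=False,
        )
    return result.returncode
```

For example, dump_stacks(12345) would write the stacks of process 12345 to stack.txt in the current directory. Taking two or three dumps a few seconds apart makes it easier to tell a real hang from a thread that is merely slow.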


I'm going to try running strace while the process is working, and hope for a
hang. Maybe that will give us some more info.

-richard

On 2/19/08 10:23 AM, "Rich Megginson" <rmeggins@xxxxxxxxxx> wrote:

Richard Hesse wrote:
Yes, every host (except the ldap hosts) runs nscd. The ldap servers are not
configured to use directory data for anything.

I just don't know.  I've not seen this before.  I suppose you could try
checking your kernel TCP/IP settings, and increasing the number of file
descriptors used -
http://directory.fedoraproject.org/wiki/Performance_Tuning
-richard


On 2/15/08 2:11 PM, "Rich Megginson" <rmeggins@xxxxxxxxxx> wrote:


Richard Hesse wrote:

nsswitch posix users/groups,

Are you using nscd?

ssh, sudo, puppet (config management), and
internally written applications.

-richard

On 2/15/08 12:53 PM, "Rich Megginson" <rmeggins@xxxxxxxxxx> wrote:



What is the application which is generating this load?


--
Fedora-directory-users mailing list
Fedora-directory-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/fedora-directory-users

