On Mon, 2014-10-06 at 00:43 +0800, Niels Peen wrote:
>
> On 05 Oct 2014, at 03:17, Nikos Mavrogiannopoulos <nmav at gnutls.org> wrote:
> >
> > So, if I understand correctly, there was a user connection at some
> > point, which got stuck?
>
> Yes. As far as I can tell these are worker processes that handle a
> user's connection. At some point the user disconnects (or loses
> signal - many of the disconnects are unintentional) and the worker
> doesn't get killed. Looking at today's log, it happens to about 1 in
> 400 workers.
>
> > There are numerous places where this could occur. Would it be possible
> > to run:
> > $ gdb /usr/sbin/ocserv 21306
> > $ bt full
>
> Hope this helps:

It does, thank you. It seems we are in the case:

'Under Linux, select() may report a socket file descriptor as "ready
for reading", while nevertheless a subsequent read blocks. This could
for example happen when data has arrived but upon examination has wrong
checksum and is discarded. There may be other circumstances in which a
file descriptor is spuriously reported as ready. Thus it may be safer to
use O_NONBLOCK on sockets that should not block.'

So if the client disconnected and a packet with a wrong checksum is
received, that block occurs, as ocserv depended on select() to check
for data. I've modified ocserv to use non-blocking sockets in master to
avoid that. It seems to work fine in my setup, but I'd like to have
more testing prior to a release.

regards,
Nikos
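
For illustration, the pattern the quoted text recommends (O_NONBLOCK
combined with select()) could look roughly like the sketch below. This
is not the actual ocserv change; the function names set_nonblocking()
and read_when_ready() and the -2 "try again" return value are
hypothetical, chosen only for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: put fd into non-blocking mode.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper: wait up to timeout_sec for fd to become
 * readable, then attempt a single read. fd must be below FD_SETSIZE.
 *
 * Returns: >0  number of bytes read
 *           0  end of file (peer closed the connection)
 *          -1  real error (see errno)
 *          -2  nothing to read right now (timeout, interrupted call,
 *              or select() reported readiness spuriously)
 */
ssize_t read_when_ready(int fd, void *buf, size_t len, int timeout_sec)
{
    fd_set rfds;
    struct timeval tv = { .tv_sec = timeout_sec, .tv_usec = 0 };
    ssize_t n;
    int ret;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    ret = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (ret == -1)
        return errno == EINTR ? -2 : -1;
    if (ret == 0)
        return -2; /* timed out */

    /* With O_NONBLOCK set, a spurious wakeup makes read() fail with
     * EAGAIN instead of blocking the worker indefinitely. */
    n = read(fd, buf, len);
    if (n == -1 &&
        (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR))
        return -2;
    return n;
}
```

The caller is assumed to call set_nonblocking() once after accepting
the connection and to treat -2 as "go back to the event loop", so a
discarded packet costs one extra loop iteration rather than a stuck
worker.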