On Wed, 2016-01-06 at 23:38 +0100, Nikos Mavrogiannopoulos wrote:
> On Wed, 2016-01-06 at 19:23 +0800, Yick Xie wrote:
> > Hi Nikos,
> > Speaking of the devil, it comes. I just rechecked the server and
> > found the problem I mentioned in the first thread. The situation
> > is far too complicated than I imagined. This user got 2 stall
> > session in this ocserv instance, but only today's one was not set
> > acct-stop-time in the radius SQL. So I only had the today's log,
> > sorry. As for the record, the radius server only got messages
> > until today 14:20, yet actually the session was still active
> > before 16:04:17, then again nothing changed in the radius server.
> > The radius server and ocserv were deployed in one server.
>
> Thanks. It seems that the worker is blocked on recv(). I suppose that
> is Linux which its select() manpage says:
>
> "Under Linux, select() may report a socket file descriptor as "ready
> for reading", while nevertheless a subsequent read blocks. This
> could for example happen when data has arrived but upon examination
> has wrong checksum and is discarded. There may be other
> circumstances in which a file descriptor is spuriously reported as
> ready. Thus it may be safer to use O_NONBLOCK on sockets that should
> not block."

I'm no longer sure that's the reason. The socket is already in
non-blocking mode, and it seems that the code in question calls:

  recv(GNUTLS_POINTER_TO_INT(ptr), data, data_size, 0);

However, in your debugging output there is:

  #0  0x00007f9d0a80538d in __libc_recv (fd=0, buf=0x118ea96, n=154, flags=-1)
      at ../sysdeps/unix/sysv/linux/x86_64/recv.c:29

This flags=-1 is quite strange.
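
For reference, here is a minimal sketch of the behaviour I would expect from
a socket that really is in non-blocking mode: recv() with flags 0 returns -1
with errno set to EAGAIN (or EWOULDBLOCK) when nothing is queued, instead of
blocking. The helper names set_nonblocking() and recv_nonblock() are
hypothetical and used only for illustration; this is not the ocserv or GnuTLS
code.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: add O_NONBLOCK to the descriptor's status flags.
 * Returns 0 on success, -1 on error (errno set by fcntl). */
int set_nonblocking(int fd)
{
    int fl = fcntl(fd, F_GETFL, 0);
    if (fl == -1)
        return -1;
    return fcntl(fd, F_SETFL, fl | O_NONBLOCK);
}

/* Hypothetical helper: recv() with flags 0, retried only on EINTR.
 * On a non-blocking socket with no data queued this returns -1 with
 * errno == EAGAIN or EWOULDBLOCK rather than blocking. */
ssize_t recv_nonblock(int fd, void *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL);
    do {
        n = recv(fd, buf, len, 0);
    } while (n == -1 && errno == EINTR);
    return n;
}
```

If a worker still ends up sleeping inside recv() on such a descriptor, then
either the descriptor is not the one that was made non-blocking, or the call
is not being made with the arguments we think it is.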