Restarting ocserv doesn't clean up all workers

On Mon, 2014-10-06 at 00:43 +0800, Niels Peen wrote:
> > On 05 Oct 2014, at 03:17, Nikos Mavrogiannopoulos <nmav at gnutls.org> wrote:
> > 
> > So, if I understand correctly, there was a user connection at some
> > point, which got stuck?
> 
> Yes. As far as I can tell these are worker processes that handle a user's connection. At some point the user disconnects (or loses signal - many of the disconnects are unintentional) and the worker doesn't get killed. Looking at today's log, it happens to about 1 in 400 workers.
> 
> > There are numerous places where this could occur. Would it be possible
> > to run:
> > $ gdb /usr/sbin/ocserv 21306
> > (gdb) bt full
> Hope this helps:

It does, thank you. It seems we are hitting the case described in the
BUGS section of select(2):
'Under Linux, select() may report a socket file descriptor as "ready for
reading", while nevertheless a subsequent read blocks.  This
could for example happen when data has arrived but upon examination
has wrong checksum and is discarded.  There may be other
circumstances in which a file descriptor is spuriously reported as
ready.  Thus it may be safer to use O_NONBLOCK on sockets that should
not block.'

So if the client has disconnected and a packet with a wrong checksum is
received, the subsequent read blocks, because ocserv relied on select()
alone to check for data. I've modified ocserv in master to use
non-blocking sockets to avoid that. It seems to work fine in my setup,
but I'd like to have more testing prior to a release.
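
To illustrate the idea, here is a minimal sketch in Python. It is not the
actual ocserv change (ocserv is written in C), and the helper name
read_ready() is made up for this example. With O_NONBLOCK set, a read that
follows a spurious "ready" report fails immediately with EAGAIN instead of
hanging, so the worker can simply go back to waiting:

```python
import select
import socket


def read_ready(sock, timeout=1.0, bufsize=4096):
    """Wait for sock to become readable, then read without ever blocking.

    Returns the bytes read (b"" means the peer closed the connection),
    or None if nothing could be read (timeout or spurious wakeup).
    read_ready is a hypothetical helper, not an ocserv function.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None  # timeout: nothing was reported as ready
    try:
        return sock.recv(bufsize)
    except BlockingIOError:
        # select() reported "ready" but no data is there (EAGAIN).
        # On a blocking socket this recv() would have hung instead.
        return None


# A local socket pair stands in for the client and the worker.
worker_side, client_side = socket.socketpair()
worker_side.setblocking(False)  # sets O_NONBLOCK on the descriptor

client_side.sendall(b"hello")
first = read_ready(worker_side)         # data is available
second = read_ready(worker_side, 0.1)   # nothing pending: None, no hang
client_side.close()
third = read_ready(worker_side)         # peer closed: b""
worker_side.close()
```

The caller treats None as "try again later" and b"" as a disconnect, which
is the point where a worker would clean up and exit.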

regards,
Nikos




