On Fri, 2014-09-19 at 18:57 +0200, Thomas Veerman wrote:
> Nikos Mavrogiannopoulos wrote on 2014-09-19 17:14:
> > On Fri, Sep 19, 2014 at 5:07 PM, Nikos Mavrogiannopoulos
> > <n.mavrogiannopoulos at gmail.com> wrote:
> >>>> Are these discussions public? Does that change solve the issue?
> >>>> Is the loop on dnsmasq on the tun or the udp socket?
> >>> The problem has not recurred since, but I won't be able to say
> >>> with confidence until some more time has passed.
> >> If that solves the issue, we'd better wrap all close calls in
> >> ocserv's main thread.
> >
> > Having read:
> > http://lwn.net/Articles/576478/
> > https://lkml.org/lkml/2002/7/17/165
> >
> > it gets more confusing. As I understand it, close shouldn't have
> > been interrupted on Linux.
>
> Interesting. I'm totally in favor of close(2) having a policy of never
> failing except for EBADF, but that's not what the man page says (Ubuntu
> 14.04.1). However, the close(2) wrapper was not the main goal of the
> patch. I was looking for a potential reason why tun devices weren't
> properly closed and came across the handle_script_exit function that
> contains the comment:
>   /* we close the lease tun fd both on success and failure.
>    * The parent doesn't need to keep the tunfd, and if it does,
>    * it causes issues to client.
>    */
> And then proceeds to close the tun fd only if proc->tun_lease.name
> doesn't contain an empty string. The patch removes that condition.

Hmm, tun_lease.name should always hold a name once the device is opened.
The name is returned by the TUNSETIFF ioctl(), which I suppose should
return a valid name whenever it succeeds. I'll add a check there to be
sure, but whether tun devices always get a name can also be seen in
syslog (assigning tun device ...).

> I
> added the close(2) wrapper as an extra measure, but I now realize that's
> unnecessary (on Linux at least).

Here it gets interesting.
> On Fri, Sep 19, 2014 at 2:54 PM, Niels Peen <niels at peen.ch> wrote:
> 10.255.232.69 is a client of ocserv that disconnected prior to dnsmasq
> returning the result. Removing ocserv from the server (but letting
> people connect with other VPN methods) prevents the problem from
> occurring.

Is there anything in the ocserv logs related to this client? Does his
device exist? Are there any other kernel-related messages that could
help?

regards,
Nikos
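
P.S. To make the close(2) point above concrete, here is a minimal sketch of a wrapper that closes a descriptor exactly once and never retries. It assumes the Linux behaviour described in the LWN article linked above, namely that the descriptor is already released when close() reports EINTR. The name close_once is hypothetical; this is neither the wrapper from Thomas's patch nor an existing ocserv function.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper, not ocserv code: close fd exactly once.
 *
 * Assumption (Linux-specific): the kernel releases the descriptor
 * before close() can return EINTR, so a retry could close an
 * unrelated fd that was opened in the meantime.
 */
static int close_once(int fd)
{
    if (fd < 0)
        return 0;       /* nothing to close */

    if (close(fd) == 0)
        return 0;

    if (errno == EINTR)
        return 0;       /* treated as closed under the assumption above */

    return -1;          /* e.g. EBADF or EIO; errno is left untouched */
}
```

On systems where an interrupted close() may leave the descriptor open, this assumption does not hold and the EINTR branch would need different handling.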