Re: Signals, sockets and threads


 



Edwin Steiner wrote:
On Wed, Oct 18, 2006 at 03:52:50PM +0200, Robert Schuster wrote:

Hi all,


Hi!


the title looks like fun, eh? :)


Not really, if you spend some time with that stuff. ;)


In an attempt to get gnu/testlet/java/net/ServerSocket/ReturnOnClose to succeed
on Cacao with the new and shiny VMChannel implementation, I found out that Cacao's
Thread.interrupt() does not cause blocking system calls to be interrupted. A


I found an even nastier problem (after debugging the whole day): If
one thread is blocking in a system call on a file descriptor (in my case
it is `accept`), and another thread closes this file descriptor, the
blocking call does not return.
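
To make the symptom concrete, here is a minimal sketch in Python (not the CACAO or Classpath code) that blocks one thread in accept() and closes the socket from another. The function name accept_survives_close and the timing values are made up for illustration. Whether the blocked call returns after the close depends on the operating system; on Linux it typically stays blocked, which is the behaviour described above.

```python
import socket
import threading


def accept_survives_close(wait_seconds=1.0):
    """Return True if accept() is still blocked after another thread closed the socket."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    server.listen(1)

    entered = threading.Event()

    def blocked_accept():
        entered.set()
        try:
            server.accept()  # blocks, because nobody ever connects
        except OSError:
            pass  # accept() was interrupted by the close

    # Daemon thread, so a permanently blocked accept() cannot keep the process alive.
    worker = threading.Thread(target=blocked_accept, daemon=True)
    worker.start()
    entered.wait()
    worker.join(0.2)  # give the worker time to actually enter accept()

    server.close()  # close the descriptor from this (other) thread
    worker.join(wait_seconds)
    return worker.is_alive()


if __name__ == "__main__":
    print("accept() still blocked after close():", accept_survives_close())
```

The short join before the close is only a heuristic to let the worker reach accept(); it is an assumption of this sketch, not a synchronization guarantee.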

What's even worse in the case of accept: The same file descriptor may
later be opened by another thread, for example by creating a socket on a
different port. Now an accept on this "new" file descriptor (the same fd
number) is started. Big problem: The _old_ accept call is still running,
and it is a race which of the accept calls will return when a connection
comes in.
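
The descriptor reuse itself is easy to observe. The following sketch (Python again, with the hypothetical helper name reused_fd_number) closes a listening socket and immediately opens a second one on a different port. POSIX hands out the lowest free descriptor number, so in a single-threaded process the second socket normally gets the number the first one just gave up.

```python
import socket


def reused_fd_number():
    """Close a listening socket, open a new one, and return both fd numbers."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    first.bind(("127.0.0.1", 0))  # kernel-chosen port
    first.listen(1)
    old_fd = first.fileno()
    first.close()  # the number old_fd is free again

    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second.bind(("127.0.0.1", 0))  # a different socket, usually a different port
    second.listen(1)
    new_fd = second.fileno()
    second.close()
    return old_fd, new_fd


if __name__ == "__main__":
    old_fd, new_fd = reused_fd_number()
    print("old fd:", old_fd, "new fd:", new_fd, "reused:", old_fd == new_fd)
```

A thread that still holds the old number cannot tell the two sockets apart, which is why the stale accept ends up competing with the new one.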

This happens with CACAO if you run mauve tests like this:

    cacao Harness java.net.HttpURLConnection

The testlet gnu.testlet.java.net.HttpURLConnection.responseCodeTest
creates a server that returns "505" responses. The ServerSocket is
closed at the end of the test, but the following test
(gnu.testlet.java.net.HttpURLConnection.responseHeadersTest) gets the
same fd for its server, and since the old accept is still running, it is
a race about whether the test gets a "505" or a "200" response,
**even though the new server uses a different port**.

BTW, Google told me that we are not the only ones having this problem:

    http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4344135

This is really a nasty case. I fear it requires special infrastructure
for `close` and blocking system calls, so the blocking threads are
interrupted, and the file descriptors get invalidated.
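
One possible shape for such infrastructure, sketched in Python under stated assumptions: every blocking accept waits in select() on both the listening socket and a wakeup pipe, and close() writes to the pipe, waits until all blocked threads have left, and only then releases the descriptors. The class InterruptibleListener and all of its members are hypothetical names for this illustration. This is not how CACAO, Classpath or libgcj handle it; it only shows the two requirements named above, interrupting the blocked threads and invalidating the descriptor before its number can be reused.

```python
import os
import select
import socket
import threading


class InterruptibleListener:
    """Hypothetical wrapper: close() wakes up threads blocked in accept()."""

    def __init__(self, host="127.0.0.1", port=0):
        self._sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self._sock.bind((host, port))
        self._sock.listen(5)
        self._sock.setblocking(False)  # the blocking happens in select() instead
        # Wakeup pipe: becomes readable once close() has been called.
        self._wake_r, self._wake_w = os.pipe()
        self._cond = threading.Condition()
        self._waiters = 0
        self._closed = False

    def accept(self):
        with self._cond:
            if self._closed:
                raise OSError("listener is closed")
            self._waiters += 1
        try:
            while True:
                # Block on the listening socket and the wakeup pipe together.
                readable, _, _ = select.select([self._sock, self._wake_r], [], [])
                if self._wake_r in readable:
                    raise OSError("listener closed during accept")
                try:
                    conn, addr = self._sock.accept()
                except BlockingIOError:
                    continue  # another thread took the connection first
                conn.setblocking(True)
                return conn, addr
        finally:
            with self._cond:
                self._waiters -= 1
                self._cond.notify_all()

    def close(self):
        with self._cond:
            if self._closed:
                return
            self._closed = True
            os.write(self._wake_w, b"x")  # never read, so it wakes every waiter
            # Release the descriptors only after all waiters have left select().
            while self._waiters:
                self._cond.wait()
        self._sock.close()
        os.close(self._wake_r)
        os.close(self._wake_w)
```

Because the descriptor numbers are given back to the kernel only after the last waiter has left, a socket created later cannot inherit a stale accept. The cost is one extra pipe per listener and a select() call in front of every accept.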

-Edwin


Many of these issues are noted in:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15430

The reuse of file descriptors was noted in a follow-up comment there after I closed the bug.

As far as the problem of accept() continuing to block after its socket is closed in another thread goes, I think it is still fixed in libgcj. The Classpath reference implementation may still be broken; if so, someone should fix it.

I opened this PR: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29604

Yeah, I know it is against libgcj, but that is where I will fix it *if* I have time. Someone else will have to work on the classpath part.

David Daney


