Hi

On 2/8/2020 4:55 PM, Glauber Costa wrote:
> Hi
>
> I've been trying to make sense of some weird behavior with the seastar
> implementation of io_uring, and started to suspect a bug in io_uring's
> connect.
>
> The situation is as follows:
>
> - A connect() call is issued (and in the backend I can choose if I use
> uring or not)
> - The connection is supposed to take a while to establish.
> - I call shutdown on the file descriptor
>
> If io_uring is not used:
> - connect() starts by returning EINPROGRESS as expected, and after
> the shutdown the file descriptor is finally made ready for epoll. I
> call getsockopt(SOL_SOCKET, SO_ERROR), and see the error (104)
>
> if io_uring is used:
> - if the SQE has the IOSQE_ASYNC flag on, connect() never returns.
> - if the SQE *does not* have the IOSQE_ASYNC flag on, then most of the
> time the test works as intended and connect() returns 104, but
> occasionally it hangs too. Note that, seastar may choose not to call
> io_uring_enter immediately and batch sqes.
>
> Sounds like some kind of race?
>
> I know C++ probably stinks like the devil for you guys, but if you are
> curious to see the code, this fails one of our unit tests:
>
> https://github.com/scylladb/seastar/blob/master/tests/unit/connect_test.cc
> See test_connection_attempt_is_shutdown
> (above is the master seastar tree, not including the io_uring implementation)
>

Is this chaining with connect().then_wrapped() asynchronous, i.e. some
kind of future/promise mechanism? I wonder whether connect() and
shutdown() there may be executed in the reverse order.

The hang with IOSQE_ASYNC sounds strange anyway.

> Please let me know if this rings a bell and if there is anything I
> should be verifying here
>

-- 
Pavel Begunkov
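
For reference, the non-io_uring sequence from the quoted report (non-blocking connect(), then shutdown(), then epoll readiness, then getsockopt(SOL_SOCKET, SO_ERROR)) could be reproduced outside seastar with something like the rough sketch below. It is not the seastar test and not an io_uring reproducer. The helper name connect_then_shutdown, the wait_s parameter and the target address are all made up for illustration, and it assumes Linux (epoll) plus a peer address for which the handshake stalls.

```python
import errno
import select
import socket


def connect_then_shutdown(addr, wait_s=2.0):
    """Non-blocking connect(), then shutdown(), then read SO_ERROR.

    Hypothetical helper for illustration only.
    Returns (connect_errno, shutdown_errno, epoll_mask, so_error).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    ep = select.epoll()
    try:
        # connect_ex() returns the errno instead of raising; for a peer
        # that is slow to answer this is expected to be EINPROGRESS.
        connect_errno = sock.connect_ex(addr)

        ep.register(sock.fileno(), select.EPOLLIN | select.EPOLLOUT)

        # Abort the attempt while it is (hopefully) still in flight.
        try:
            sock.shutdown(socket.SHUT_RDWR)
            shutdown_errno = 0
        except OSError as exc:
            shutdown_errno = exc.errno

        # The fd should become ready after shutdown(); the wait is
        # bounded so the sketch itself cannot hang.
        events = ep.poll(wait_s)
        epoll_mask = events[0][1] if events else 0

        # Reading SO_ERROR returns the pending socket error and clears it.
        so_error = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        return connect_errno, shutdown_errno, epoll_mask, so_error
    finally:
        ep.close()
        sock.close()


if __name__ == "__main__":
    # 10.255.255.1 is only an assumed "slow to connect" address; what
    # actually happens depends on local routing and firewalling.
    c, s, mask, err = connect_then_shutdown(("10.255.255.1", 80))
    print("connect:", errno.errorcode.get(c, c))
    print("shutdown:", errno.errorcode.get(s, s))
    print("epoll mask:", hex(mask))
    print("SO_ERROR:", errno.errorcode.get(err, err))
```

If the connect is still pending when shutdown() runs, the report above suggests SO_ERROR should read 104 (ECONNRESET on Linux). If the attempt fails or completes before shutdown(), the values will differ, so this is only a baseline to compare the io_uring path against.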