Re: shutdown not affecting connection?

Hi

BTW, my apologies but I should have specified the kernel I am running:
90206ac99c1f25b7f7a4c2c40a0b9d4561ffa9bf

On Sat, Feb 8, 2020 at 9:26 AM Pavel Begunkov <asml.silence@xxxxxxxxx> wrote:
>
> Hi
>
> On 2/8/2020 4:55 PM, Glauber Costa wrote:
> > Hi
> >
> > I've been trying to make sense of some weird behavior with the seastar
> > implementation of io_uring, and started to suspect a bug in io_uring's
> > connect.
> >
> > The situation is as follows:
> >
> > - A connect() call is issued (and in the backend I can choose if I use
> > uring or not)
> > - The connection is supposed to take a while to establish.
> > - I call shutdown on the file descriptor
> >
> > If io_uring is not used:
> > - connect() starts by returning EINPROGRESS as expected, and after
> > the shutdown the file descriptor is finally made ready for epoll. I
> > call getsockopt(SOL_SOCKET, SO_ERROR), and see the error (104)
> >
> > if io_uring is used:
> > - if the SQE has the IOSQE_ASYNC flag on, connect() never returns.
> > - if the SQE *does not* have the IOSQE_ASYNC flag on, then most of the
> > time the test works as intended and connect() returns 104, but
> > occasionally it hangs too. Note that, seastar may choose not to call
> > io_uring_enter immediately and batch sqes.
> >
> > Sounds like some kind of race?
> >
> > I know C++ probably stinks like the devil for you guys, but if you are
> > curious to see the code, this fails one of our unit tests:
> >
> > https://github.com/scylladb/seastar/blob/master/tests/unit/connect_test.cc
> > See test_connection_attempt_is_shutdown
> > (above is the master seastar tree, not including the io_uring implementation)
> >
> Is this chaining with connect().then_wrapped() asynchronous? Like kind
> of future/promise stuff?

Correct. then_wrapped executes eventually, once connect completes with
either success or failure.

> I wonder, if connect() and shutdown() there may
> be executed in the reverse order.

The methods connect and shutdown will execute in this order.
But connect will just queue something that will later be sent down to
the kernel.

I initially suspected an ordering issue on my side. Two things made me
start suspecting a bug instead:
- I can force the code to grab an sqe and call io_uring_enter at the
moment the connect() call happens, and I see no change.
- IOSQE_ASYNC changes this behavior, as you acknowledged yourself.

It seems to me that if shutdown happens after the call to
io_uring_enter, while the sqe is still sitting on a kernel queue
somewhere, the connection hangs forever instead of failing right away
as I would expect.
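
To make the baseline concrete, here is a minimal sketch of the
non-io_uring sequence described above (non-blocking connect, shutdown,
wait for readiness, read SO_ERROR). It is written in Python rather than
seastar's C++ purely for brevity. The function names and the
select-based wait are illustrative assumptions, not seastar code, and
the sketch assumes Linux behavior for shutdown on a socket that is
still connecting.

```python
import errno
import select
import socket


def start_connect(addr):
    """Start a non-blocking connect; return (socket, errno from connect_ex)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    # EINPROGRESS is the expected result while the handshake is pending.
    rc = sock.connect_ex(addr)
    return sock, rc


def shutdown_and_collect(sock, timeout=5.0):
    """Shut the socket down, wait for readiness, return its SO_ERROR value.

    Returns None if the descriptor does not become ready within `timeout`
    seconds, which would be the analogue of the hang discussed above.
    Assumption: shutdown on a still-connecting socket succeeds (Linux);
    other systems may raise OSError here instead.
    """
    sock.shutdown(socket.SHUT_RDWR)
    readable, writable, exceptional = select.select(
        [sock], [sock], [sock], timeout)
    if not (readable or writable or exceptional):
        return None
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)


def describe(code):
    """Render an SO_ERROR value as 'NAME (number)' for logging."""
    if code is None:
        return "hang (never became ready)"
    if code == 0:
        return "no error (0)"
    return "%s (%d)" % (errno.errorcode.get(code, "unknown"), code)
```

Given an address whose handshake stalls (a hypothetical placeholder
such as ("10.255.255.1", 80); pick one that suits your network), one
would call sock, rc = start_connect(addr) and then
describe(shutdown_and_collect(sock)). On Linux, errno 104 is
ECONNRESET, so that is what the value mentioned above corresponds to.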
>
> The hang with IOSQE_ASYNC sounds strange anyway.
>
>
> > Please let me know if this rings a bell and if there is anything I
> > should be verifying here
> >
>
> --
> Pavel Begunkov


