On Sat, Feb 8, 2020 at 3:29 PM Avi Kivity <avi@xxxxxxxxxxxx> wrote:
>
> On 2/8/20 10:20 PM, Glauber Costa wrote:
> >>
> >>> Perhaps you can reduce the
> >>> problem to a small C reproducer?
> >>>
> >> That was my intended next step, yes
> > s***, I didn't resist and I had to explain to my wife that no, I don't
> > like io_uring more than I like her.
> >
> > But here it is.
> >
> > This is a modification of test/connect.c.
> > I added a pthread comparison example that should achieve the same
> > sequence of events:
> > - try to sync connect
> > - wait a bit
> > - shutdown
> >
> > I added a fixed wait for pthread to make sure that shutdown is not
> > called before connect.
> >
> > For io_uring, the shutdown is configurable with the program argument.
> > This works just fine if I sleep before shutdown (as I would expect from a race).
> > This hangs every time if I don't.
> >
> > Unless I am missing something I don't think this is the expected behavior
>
>
> I think it is understandable. Since the socket is blocking, uring moves
> the work to a workqueue, and the shutdown() happens before the workqueue
> has had a chance to process the connection attempt. So we'll have to
> cancel the sqe.

It does seem to me that this means every shutdown must be accompanied
by a cancel of the pending connect.

From the user's perspective, this still feels like a bug to me. The fact
that the connect had to be moved to a workqueue is an implementation
detail:
1) we asked the kernel to do something
2) the kernel returned
3) we called shutdown() expecting the pending connect to go away, and it
   never returned.

If cancel-after-connect to avoid these races is the intended behavior,
it would be nice to get this documented somewhere in io_uring's
fantastic documentation.

In hindsight, cancel-on-shutdown is quite obvious and natural. But I
just spent two days making it obvious and natural.

>
>
> Jens, doesn't the blocking connect consume a kernel thread while
> it's waiting for a connection? Or does it just set things up and move on?
>
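
For reference, the pthread side of the comparison (blocking connect in a thread, wait a bit, shutdown) can be sketched roughly as below. This is a minimal illustration and not the actual modified test/connect.c: the helper name connect_then_shutdown(), its parameters, and its return convention are hypothetical, and it assumes Linux, where shutdown() is expected to wake a connect() that is still blocked.

```c
#define _POSIX_C_SOURCE 200809L

#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <pthread.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/* Arguments and result shared with the connecting thread. */
struct connect_job {
    int fd;
    struct sockaddr_in addr;
    int ret;                /* return value of connect(): 0 or -1 */
};

/* Thread body: issue a blocking connect() and record its return value. */
static void *connect_thread(void *arg)
{
    struct connect_job *job = arg;

    job->ret = connect(job->fd, (struct sockaddr *)&job->addr,
                       sizeof(job->addr));
    return NULL;
}

/*
 * Hypothetical helper (not part of test/connect.c): start a blocking
 * connect to ip:port in a thread, sleep delay_ms, then shutdown() the
 * socket. Returns what connect() returned (0 or -1), or -2 if setup failed.
 * Assumption: on Linux, shutdown() wakes a connect() that is still blocked.
 */
int connect_then_shutdown(const char *ip, unsigned short port, int delay_ms)
{
    struct connect_job job;
    struct timespec ts;
    pthread_t tid;

    assert(ip != NULL);
    memset(&job, 0, sizeof(job));
    job.addr.sin_family = AF_INET;
    job.addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &job.addr.sin_addr) != 1)
        return -2;

    job.fd = socket(AF_INET, SOCK_STREAM, 0);
    if (job.fd < 0)
        return -2;

    if (pthread_create(&tid, NULL, connect_thread, &job) != 0) {
        close(job.fd);
        return -2;
    }

    /* Fixed wait so that shutdown() is not called before connect(). */
    ts.tv_sec = delay_ms / 1000;
    ts.tv_nsec = (long)(delay_ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);

    /* May fail with ENOTCONN if connect() already failed; ignored here. */
    shutdown(job.fd, SHUT_RDWR);

    pthread_join(tid, NULL);
    close(job.fd);
    return job.ret;
}
```

Called against an unresponsive address, for example connect_then_shutdown("192.0.2.1", 80, 100) (192.0.2.1 is in the TEST-NET-1 documentation range), this should return rather than hang once shutdown() runs. That is the behavior the io_uring variant is being compared against.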