Re: [PATCH 7/8] io_uring: fix sequencing issues with linked timeouts

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 11/20/19 5:42 AM, Pavel Begunkov wrote:
> On 11/20/2019 1:13 AM, Jens Axboe wrote:
>> On 11/19/19 1:51 PM, Pavel Begunkov wrote:
>>> On 16/11/2019 04:53, Jens Axboe wrote:
>>>> We have an issue with timeout links that are deeper in the submit chain,
>>>> because we only handle it upfront, not from later submissions. Move the
>>>> prep + issue of the timeout link to the async work prep handler, and do
>>>> it normally for non-async queue. If we validate and prepare the timeout
>>>> links upfront when we first see them, there's nothing stopping us from
>>>> supporting any sort of nesting.
>>>>
>>>> Fixes: 2665abfd757f ("io_uring: add support for linked SQE timeouts")
>>>> Reported-by: Pavel Begunkov <asml.silence@xxxxxxxxx>
>>>> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>>>> ---
>>>
>>>> @@ -923,6 +942,7 @@ static void io_fail_links(struct io_kiocb *req)
>>>>    			io_cqring_fill_event(link, -ECANCELED);
>>>>    			__io_double_put_req(link);
>>>>    		}
>>>> +		kfree(sqe_to_free);
>>>>    	}
>>>>    
>>>>    	io_commit_cqring(ctx);
>>>> @@ -2668,8 +2688,12 @@ static void io_wq_submit_work(struct io_wq_work **workptr)
>>>>    
>>>>    	/* if a dependent link is ready, pass it back */
>>>>    	if (!ret && nxt) {
>>>> -		io_prep_async_work(nxt);
>>>> +		struct io_kiocb *link;
>>>> +
>>>> +		io_prep_async_work(nxt, &link);
>>>>    		*workptr = &nxt->work;
>>> Are we safe here without synchronisation?
>>> Probably io_link_timeout_fn() may miss the new value
>>> (doing io-wq cancel).
>>
>> Miss what new value? Don't follow that part.
>>
> 
> As I've got the idea of postponing:
> at the moment of io_queue_linked_timeout(), a request should be either
> in io-wq or completed. So, @nxt->work after the assignment above should
> be visible to asynchronously called io_wq_cancel_work().
> 
>>>>   *workptr = &nxt->work;
> However, there is no synchronisation for this assignment, and it could
> be not visible from a parallel thread. Is it somehow handled in io-wq?
> 
> The pseudo code is below (th1, th2 - parallel threads)
> th1: *workptr = &req->work;
> // non-atomic assignment, the new value of workptr (i.e. &req->work)
> // isn't yet propagated to th2
> 
> th1: io_queue_linked_timeout()
> th2: io_linked_timeout_fn(), calls io_wq_cancel_work(), @req not found
> th2: // memory model finally propagated *workptr = &req->work to @th2
> 
> 
> Please, let me know if that's also not clear.

OK, so I see what you're saying, but I don't think it's missing locking.
There is, however, a gap where we won't be able to find the request.
What we need is a way to assign the io-wq current work before we call
io_queue_linked_timeout(). Something ala:

	io_prep_async_work(nxt, &link);
	*workptr = &nxt->work;
+	io_wq_assign_cur();
	if (link)
		io_queue_linked_timeout(link);

where io_wq_assign_cur() ensures that worker->cur_work is set to the new
work, so we know it's discoverable before calling
io_queue_linked_timeout(). Probably also needs to include the
->get_work() call as part of that, so moving the logic around a bit in
io_worker_handle_work().

If we do that, then by the time we arm the linked timer, we know we'll
be able to find the new work item. The old work is done at this point
anyway, so doing this a bit earlier is fine.

-- 
Jens Axboe




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux