Re: [PATCHSET 0/2] io_uring support for linked timeouts

Jens Axboe <axboe@xxxxxxxxx> · Fri, 15 Nov 2019 08:13:42 -0700

On 11/15/19 7:21 AM, Pavel Begunkov wrote:
> On 15/11/2019 12:40, Pavel Begunkov wrote:
>>>> Finally got to this patch. I think it adds too many edge cases and
>>>> isn't integrated consistently into what we have now. I would love to
>>>> hear your vision, but I'd try to implement them in such a way that it
>>>> doesn't need to modify the framework, at least for some particular case.
>>>> In other words, as if opcodes could have been added from the outside
>>>> with a function table.
>>>
>>> I agree, it could do with a bit of cleanup. Incrementals would be
>>> appreciated!
>>>
>>>> It's also not that consistent with the userspace API.
>>>>
>>>> 1. If we specified drain for the timeout, should its start be delayed
>>>> until then? I would prefer so.
>>>>
>>>> E.g. send_msg + drained linked_timeout, which would set a timeout from the
>>>> start of the send.
>>>
>>> What cases would that apply to, what would the timeout even do in this
>>> case? The point of the linked timeout is to abort the previous command.
>>> Maybe I'm not following what you mean here.
>>>
>> Hmm, got it a bit wrong with defer. io_queue_link_head() can defer it
>> without setting timeout. However, it seems that io_wq_submit_work()
>> won't set a timer, as it uses __io_submit_sqe(), but not
>> __io_queue_sqe(), which handles all this with linked timeouts.
>>
>> Indeed, maybe you meant to place it in __io_submit_sqe()?
>>
>>>> 2. Why can it only be the second one in a link? May we want to cancel
>>>> from a certain point?
>>>> e.g. "op1 -> op2 -> timeout -> op3" cancels op2 and op3
>>>
>>> Logically it need not be the second, it just has to follow another
>>> request. Is there a bug there?
>>>
>> __io_queue_sqe looks only for the second one in a link. Other linked
>> timeouts will be ignored, if I get the code right.
>>
>> Also linking may (or __may not__) be an issue. As you remember, the head
>> is linked through link_list, and all following with list.
>> i.e. req_head.link_list <-> req.list <-> req.list <-> req.list
>>
>> free_req() (last time I saw it) expects that the timeout's previous
>> request is linked with link_list. If a timeout can fire in the middle of
>> a link (during execution), this might not be the case. But it depends on
>> when we set a timeout.
>>
>> BTW, personally I'd link them all through link_list. E.g. we may get rid
>> of splicing in free_req(). I'll try to do that later.
>>
>>>> 3. It's a bit strange that the timeout affects a request to its left,
>>>> and then as a consequence cancels everything to its right (i.e. the
>>>> chain). Could we place it in the head? Then it would affect all requests
>>>> to the right of it.
>>>
>>> But that's how links work, though. If you keep linking, then everything
>>> that depends on X will fail, if X itself isn't successful.
>>>
>> Right. This is about which userspace API would be saner: placing the
>> timeout to the left of a request or to the right, with the same
>> resulting effect.
>>
>> Let's put this question away until the others are clear.
>>
>>>> 4. I'd prefer to handle it as a new generic command and setting a timer
>>>> in __io_submit_sqe().
>>>>
>>>> I believe we can do it more gracefully, and at the same time give
>>>> more freedom to the user. What do you think?
>>>
>>> I just think we need to make sure the ground rules are sane. I'm going
>>> to write a few test cases to make sure we do the right thing.
>>>
>>
> Ok, let me try to state some rules to discuss:

> 1. REQ -> LINK_TIMEOUT
> is a valid use case

Yes

> 2. timeout is set at the moment of starting execution of operation.
> e.g. REQ1, REQ2|DRAIN -> LINK_TIMEOUT
>
> The timer is set at the moment when everything is drained and we
> are sending REQ2, i.e. after completion of REQ1

Right, the timeout is prepped before REQ2 is started, armed when it is
started (if not already done). The prep + arm split is important to
ensure that a short timeout doesn't even find REQ2.

> 3. REQ1 -> LINK_TIMEOUT1 -> REQ2 -> LINK_TIMEOUT2
> 
> is valid, and LINK_TIMEOUT2 will be set at the moment REQ2 starts
> executing. It also means that if LINK_TIMEOUT1 fires, it will
> cancel REQ1, and REQ2 with LINK_TIMEOUT2 (with proper return
> values)

That's not valid with the patches I sent. It could be, but we'd need to
fix that bit.

> 4. REQ1, LINK_TIMEOUT
> is invalid, fail it

Correct

> 5. LINK_TIMEOUT1 -> LINK_TIMEOUT2
> Fail first, link-fail (aka cancelled) for the second one

Correct

> 6. REQ1 -> LINK_TIMEOUT1 -> LINK_TIMEOUT2
> execute REQ1+LINK_TIMEOUT1, and then fail LINK_TIMEOUT2 as
> invalid. Also, LINK_TIMEOUT2 could be just cancelled
> (e.g. if fail_links for REQ1)

Given case 5, why would this one be legal?
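
To make the rules above easier to turn into test cases, here is a rough,
self-contained sketch of how a submitted sequence might be classified. It
is only a model under stated assumptions, not the code in the patches:
struct model_sqe, enum verdict and classify_chain() are made-up names, the
"link" flag merely stands in for IOSQE_IO_LINK, and failures are decided
statically rather than at execution time. It follows the rules as proposed
in the list, so it accepts case 3 (which the patches as sent do not handle)
and marks the second timeout in case 6 as invalid, which is still an open
question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a submission entry; NOT struct io_uring_sqe. */
enum op_kind { OP_REQ, OP_LINK_TIMEOUT };

struct model_sqe {
	enum op_kind kind;
	bool link;	/* stands in for IOSQE_IO_LINK: next entry is linked */
};

enum verdict { V_OK, V_INVALID, V_CANCELLED };

/*
 * Classify each entry of a submitted sequence (assumed rules):
 *  - a linked timeout is only OK if the previous entry was linked to it
 *    and was not itself a linked timeout (cases 1, 4, 5, 6);
 *  - once an entry in a chain is rejected, the rest of that chain is
 *    marked cancelled (case 5).
 */
static void classify_chain(const struct model_sqe *sqes, size_t n,
			   enum verdict *out)
{
	bool prev_linked = false;	/* previous entry carried the link flag */
	bool prev_is_timeout = false;
	bool chain_failed = false;	/* earlier entry in this chain rejected */

	assert(n == 0 || (sqes != NULL && out != NULL));

	for (size_t i = 0; i < n; i++) {
		bool is_timeout = sqes[i].kind == OP_LINK_TIMEOUT;

		if (!prev_linked)
			chain_failed = false;	/* a new chain starts here */

		if (chain_failed)
			out[i] = V_CANCELLED;
		else if (is_timeout && (!prev_linked || prev_is_timeout))
			out[i] = V_INVALID;
		else
			out[i] = V_OK;

		if (out[i] != V_OK)
			chain_failed = true;

		prev_linked = sqes[i].link;
		prev_is_timeout = is_timeout;
	}
}
```

With this model, "REQ -> LINK_TIMEOUT" comes out as OK/OK, "REQ1,
LINK_TIMEOUT" as OK/INVALID, and "LINK_TIMEOUT1 -> LINK_TIMEOUT2" as
INVALID/CANCELLED.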

-- 
Jens Axboe