Re: [PATCH 3/7] io_uring: don't wait when under-submitting

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/17/19 4:55 PM, Jann Horn wrote:
> On Tue, Dec 17, 2019 at 11:54 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
>> There is no reliable way to submit and wait in a single syscall, as
>> io_submit_sqes() may under-consume sqes (in case of an early error).
>> Then it will wait for not-yet-submitted requests, deadlocking the user
>> in most cases.
>>
>> In such cases adjust min_complete, so it won't wait for more than
>> what have been submitted in the current io_uring_enter() call. It
>> may be less than total in-flight, but that up to a user to handle.
>>
>> Signed-off-by: Pavel Begunkov <asml.silence@xxxxxxxxx>
>> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
> [...]
>>         if (flags & IORING_ENTER_GETEVENTS) {
>>                 unsigned nr_events = 0;
>>
>>                 min_complete = min(min_complete, ctx->cq_entries);
>> +               if (submitted != to_submit)
>> +                       min_complete = min(min_complete, (u32)submitted);
> 
> Hm. Let's say someone submits two requests, first an ACCEPT request
> that might stall indefinitely and then a WRITE to a file on disk that
> is expected to complete quickly; and the caller uses min_complete=1
> because they want to wait for the WRITE op. But now the submission of
> the WRITE fails, io_uring_enter() computes min_complete=min(1, 1)=1,
> and it blocks on the ACCEPT op. That would be bad, right?
> 
> If the usecase I described is valid, I think it might make more sense
> to do something like this:
> 
> u32 missing_submissions = to_submit - submitted;
> min_complete = min(min_complete, ctx->cq_entries);
> if ((flags & IORING_ENTER_GETEVENTS) && missing_submissions < min_complete) {
>   min_complete -= missing_submissions;
>   [...]
> }
> 
> In other words: If we do a partially successful submission, only wait
> as long as we know that userspace definitely wants us to wait for one
> of the pending requests; and once we can't tell whether userspace
> intended to wait longer, return to userspace and let the user decide.
> 
> Or it might make sense to just ignore IORING_ENTER_GETEVENTS
> completely in the partial submission case, in case userspace wants to
> immediately react to the failed request by writing out an error
> message to a socket or whatever. This case probably isn't
> performance-critical, right? And it would simplify things a bit.

That's a good point, and Pavel's first patch actually did that. I
didn't consider the different request type case, which might be
uncommon but definitely valid.

Probably the safest bet here is just to not wait at all if we fail
submitting all of them. This isn't a fast path, there was an error
somehow which meant we didn't submit it all. So just return the
submit count (including 0, not -EAGAIN) if we fail submitting,
and ignore IORING_ENTER_GETEVENTS for that case.

Pavel, care to submit a new one? I'll drop this one now.

-- 
Jens Axboe




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux