Re: [PATCH 3/3] io_uring: refactor io_sq_offload_create()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 7/23/21 11:11 AM, Al Viro wrote:
> On Fri, Jul 23, 2021 at 10:17:27AM -0600, Jens Axboe wrote:
>> On 7/22/21 6:10 PM, Al Viro wrote:
>>> On Thu, Jul 22, 2021 at 05:42:55PM -0600, Jens Axboe wrote:
>>>
>>>>> So how can we possibly get there with tsk->files == NULL and what does it
>>>>> have to do with files, anyway?
>>>>
>>>> It's not the clearest, but the files check is just to distinguish between
>>>> exec vs normal cancel. For exec, we pass in files == NULL. It's not
>>>> related to task->files being NULL or not, we explicitly pass NULL for
>>>> exec.
>>>
>>> Er...  So turn that argument into bool cancel_all, and pass false on exit and
>>> true on exec? 
>>
>> Yes
>>
>>> While we are at it, what happens if you pass io_uring descriptor
>>> to another process, close yours and then have the recepient close the one it
>>> has gotten?  AFAICS, io_ring_ctx_wait_and_kill(ctx) will be called in context
>>> of a process that has never done anything io_uring-related.  Can it end up
>>> trying to resubmit some requests?> 
>>> I rather hope it can't happen, but I don't see what would prevent it...
>>
>> No, the pending request would either have gone to a created thread of
>> the original task on submission, or it would be sitting in a
>> ready-to-retry state. The retry would attempt to queue to original task,
>> and either succeed (if still alive) or get failed with -ECANCELED. Any
>> given request is tied to the original task.
> 
> Hmm...  Sure, you'll be pushing it to the same io_wqe it went through originally,
> but you are still in context of io_uring_release() caller, aren't you?
> 
> So you call io_wqe_wake_worker(), and it decides that all threads are busy,
> but ->nr_workers is still below ->max_workers.  And proceeds to
> 	create_io_worker(wqe->wq, wqe, acct->index);
> which will create a new io-worker thread, but do that in the thread group of
> current, i.e. the caller of io_uring_release().  Looks like we'd get
> an io-worker thread with the wrong parent...
> 
> What am I missing here?

I think there's two main cases here:

1) Request has already been issued by original task in some shape or form.
   This is the case I was mentioning in my previous reply.

2) Request has not been seen yet, this would be a new submit.

For case #2, let's say you pass the ring to another task, there are entries
in the SQ ring that haven't been submitted yet. Will these go under the new
task? Yes they would - you've passed the ring to someone else at that point.

For case #1, by definition the request has already been issued and is
assigned to the task that did that. This either happens from the syscall
itself, or from task_work which is also run from that original task.

For your particular case, it's either an original async queue (hasn't
been done on this task before), in which case it will create a thread
from the right task. The other option is that we decide to async requeue
from async context for some odd reason, and we're already in the right
context at that point to create a new thread (which should not even
happen, as the same thread is now available).

I don't see a case where this wouldn't work as expected. However, I do
think we could add a WARN_ON_ONCE (or similar) and reject any attempt
to io_wq_enqueue() from the wrong context as a proactive measure to
catch any bugs in testing rather than later.

Outside of that, we're not submitting off release, only killing anything
pending. The only odd case there is iopoll, but that doesn't resubmit
here.

-- 
Jens Axboe




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux