Re: [PATCH for-next] io_uring: ensure IOSQE_ASYNC file table grabbing works, with SQPOLL

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/09/2020 21:18, Jens Axboe wrote:
> On 9/10/20 7:11 AM, Jens Axboe wrote:
>> On 9/10/20 6:37 AM, Pavel Begunkov wrote:
>>> On 09/09/2020 19:07, Jens Axboe wrote:
>>>> On 9/9/20 9:48 AM, Pavel Begunkov wrote:
>>>>> On 09/09/2020 16:10, Jens Axboe wrote:
>>>>>> On 9/9/20 1:09 AM, Pavel Begunkov wrote:
>>>>>>> On 09/09/2020 01:54, Jens Axboe wrote:
>>>>>>>> On 9/8/20 3:22 PM, Jens Axboe wrote:
>>>>>>>>> On 9/8/20 2:58 PM, Pavel Begunkov wrote:
>>>>>>>>>> On 08/09/2020 20:48, Jens Axboe wrote:
>>>>>>>>>>> Fd instantiating commands like IORING_OP_ACCEPT now work with SQPOLL, but
>>>>>>>>>>> we have an error in grabbing that if IOSQE_ASYNC is set. Ensure we assign
>>>>>>>>>>> the ring fd/file appropriately so we can defer grab them.
>>>>>>>>>>
>>>>>>>>>> IIRC, for fcheck() in io_grab_files() to work it should be under fdget(),
>>>>>>>>>> that isn't the case with SQPOLL threads. Am I mistaken?
>>>>>>>>>>
>>>>>>>>>> And it looks strange that the following snippet will effectively disable
>>>>>>>>>> such requests.
>>>>>>>>>>
>>>>>>>>>> fd = dup(ring_fd)
>>>>>>>>>> close(ring_fd)
>>>>>>>>>> ring_fd = fd
>>>>>>>>>
>>>>>>>>> Not disagreeing with that, I think my initial posting made it clear
>>>>>>>>> it was a hack. Just piled it in there for easier testing in terms
>>>>>>>>> of functionality.
>>>>>>>>>
>>>>>>>>> But the next question is how to do this right...> 
>>>>>>>> Looking at this a bit more, and I don't necessarily think there's a
>>>>>>>> better option. If you dup+close, then it just won't work. We have no
>>>>>>>> way of knowing if the 'fd' changed, but we can detect if it was closed
>>>>>>>> and then we'll end up just EBADF'ing the requests.
>>>>>>>>
>>>>>>>> So right now the answer is that we can support this just fine with
>>>>>>>> SQPOLL, but you better not dup and close the original fd. Which is not
>>>>>>>> ideal, but better than NOT being able to support it.
>>>>>>>>
>>>>>>>> Only other option I see is to to provide an io_uring_register()
>>>>>>>> command to update the fd/file associated with it. Which may be useful,
>>>>>>>> it allows a process to indeed to this, if it absolutely has to.
>>>>>>>
>>>>>>> Let's put aside such dirty hacks, at least until someone actually
>>>>>>> needs it. Ideally, for many reasons I'd prefer to get rid of
>>>>>>
>>>>>> BUt it is actually needed, otherwise we're even more in a limbo state of
>>>>>> "SQPOLL works for most things now, just not all". And this isn't that
>>>>>> hard to make right - on the flush() side, we just need to park/stall the
>>>>>
>>>>> I understand that it isn't hard, but I just don't want to expose it to
>>>>> the userspace, a) because it's a userspace API, so couldn't probably be
>>>>> killed in the future, b) works around kernel's problems, and so
>>>>> shouldn't really be exposed to the userspace in normal circumstances.
>>>>>
>>>>> And it's not generic enough because of a possible "many fds -> single
>>>>> file" mapping, and there will be a lot of questions and problems.
>>>>>
>>>>> e.g. if a process shares a io_uring with another process, then
>>>>> dup()+close() would require not only this hook but also additional
>>>>> inter-process synchronisation. And so on.
>>>>
>>>> I think you're blowing this out of proportion. Just to restate the
>>>
>>> I just think that if there is a potentially cleaner solution without
>>> involving userspace, we should try to look for it first, even if it
>>> would take more time. That was the point.
>>
>> Regardless of whether or not we can eliminate that need, at least it'll
>> be a relaxing of the restriction, not an increase of it. It'll never
>> hurt to do an extra system call for the case where you're swapping fds.
>> I do get your point, I just don't think it's a big deal.
> 
> BTW, I don't see how we can ever get rid of a need to enter the kernel,
> we'd need some chance at grabbing the updated ->files, for instance.

Thanks for taking a look.
Yeah, agree, it should get it from somewhere, and that reminds me that
we have a similar situation with sqo_mm -- it grabs it from the
task-creator and keeps it to the end... Do we really need to set
->files of another thread? Retaining to the end seem to work well
enough with mm. And we need, then it would be more consistent
to replace mm there as well.

> Might be possible to hold a reference to the task and grab it from
> there, though feels a bit iffy to hold a task reference from the ring on
> the task that holds a reference to the ring. Haven't looked too close,
> should work though as this won't hold a file/files reference, it's just
> a freeing reference.

BTW, if the process-creator dies, then its ->files might be killed
and ->sqo_files become dangling, so should be invalidated. Your
approach with a task's reference probably handles it naturally.

-- 
Pavel Begunkov



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux