On 9/11/20 1:23 PM, Pavel Begunkov wrote:
> On 10/09/2020 21:18, Jens Axboe wrote:
>> On 9/10/20 7:11 AM, Jens Axboe wrote:
>>> On 9/10/20 6:37 AM, Pavel Begunkov wrote:
>>>> On 09/09/2020 19:07, Jens Axboe wrote:
>>>>> On 9/9/20 9:48 AM, Pavel Begunkov wrote:
>>>>>> On 09/09/2020 16:10, Jens Axboe wrote:
>>>>>>> On 9/9/20 1:09 AM, Pavel Begunkov wrote:
>>>>>>>> On 09/09/2020 01:54, Jens Axboe wrote:
>>>>>>>>> On 9/8/20 3:22 PM, Jens Axboe wrote:
>>>>>>>>>> On 9/8/20 2:58 PM, Pavel Begunkov wrote:
>>>>>>>>>>> On 08/09/2020 20:48, Jens Axboe wrote:
>>>>>>>>>>>> Fd instantiating commands like IORING_OP_ACCEPT now work with SQPOLL, but
>>>>>>>>>>>> we have an error in grabbing that if IOSQE_ASYNC is set. Ensure we assign
>>>>>>>>>>>> the ring fd/file appropriately so we can defer grab them.
>>>>>>>>>>>
>>>>>>>>>>> IIRC, for fcheck() in io_grab_files() to work it should be under fdget(),
>>>>>>>>>>> that isn't the case with SQPOLL threads. Am I mistaken?
>>>>>>>>>>>
>>>>>>>>>>> And it looks strange that the following snippet will effectively disable
>>>>>>>>>>> such requests.
>>>>>>>>>>>
>>>>>>>>>>> fd = dup(ring_fd)
>>>>>>>>>>> close(ring_fd)
>>>>>>>>>>> ring_fd = fd
>>>>>>>>>>
>>>>>>>>>> Not disagreeing with that, I think my initial posting made it clear
>>>>>>>>>> it was a hack. Just piled it in there for easier testing in terms
>>>>>>>>>> of functionality.
>>>>>>>>>>
>>>>>>>>>> But the next question is how to do this right...
>>>>>>>>>
>>>>>>>>> Looking at this a bit more, and I don't necessarily think there's a
>>>>>>>>> better option. If you dup+close, then it just won't work. We have no
>>>>>>>>> way of knowing if the 'fd' changed, but we can detect if it was closed
>>>>>>>>> and then we'll end up just EBADF'ing the requests.
>>>>>>>>>
>>>>>>>>> So right now the answer is that we can support this just fine with
>>>>>>>>> SQPOLL, but you better not dup and close the original fd. Which is not
>>>>>>>>> ideal, but better than NOT being able to support it.
>>>>>>>>>
>>>>>>>>> Only other option I see is to provide an io_uring_register()
>>>>>>>>> command to update the fd/file associated with it. Which may be useful,
>>>>>>>>> it allows a process to indeed do this, if it absolutely has to.
>>>>>>>>
>>>>>>>> Let's put aside such dirty hacks, at least until someone actually
>>>>>>>> needs it. Ideally, for many reasons I'd prefer to get rid of
>>>>>>>
>>>>>>> But it is actually needed, otherwise we're even more in a limbo state of
>>>>>>> "SQPOLL works for most things now, just not all". And this isn't that
>>>>>>> hard to make right - on the flush() side, we just need to park/stall the
>>>>>>
>>>>>> I understand that it isn't hard, but I just don't want to expose it to
>>>>>> the userspace, a) because it's a userspace API, so probably couldn't be
>>>>>> killed in the future, b) works around kernel's problems, and so
>>>>>> shouldn't really be exposed to the userspace in normal circumstances.
>>>>>>
>>>>>> And it's not generic enough because of a possible "many fds -> single
>>>>>> file" mapping, and there will be a lot of questions and problems.
>>>>>>
>>>>>> e.g. if a process shares an io_uring with another process, then
>>>>>> dup()+close() would require not only this hook but also additional
>>>>>> inter-process synchronisation. And so on.
>>>>>
>>>>> I think you're blowing this out of proportion. Just to restate the
>>>>
>>>> I just think that if there is a potentially cleaner solution without
>>>> involving userspace, we should try to look for it first, even if it
>>>> would take more time. That was the point.
>>>
>>> Regardless of whether or not we can eliminate that need, at least it'll
>>> be a relaxing of the restriction, not an increase of it. It'll never
>>> hurt to do an extra system call for the case where you're swapping fds.
>>> I do get your point, I just don't think it's a big deal.
>>
>> BTW, I don't see how we can ever get rid of a need to enter the kernel,
>> we'd need some chance at grabbing the updated ->files, for instance.
>
> Thanks for taking a look.
> Yeah, agree, it should get it from somewhere, and that reminds me that
> we have a similar situation with sqo_mm -- it grabs it from the
> task-creator and keeps it to the end... Do we really need to set
> ->files of another thread? Retaining to the end seems to work well
> enough with mm. And if we need it, then it would be more consistent
> to replace mm there as well.

The files can change, so we need the juggling.

>> Might be possible to hold a reference to the task and grab it from
>> there, though feels a bit iffy to hold a task reference from the ring on
>> the task that holds a reference to the ring. Haven't looked too close,
>> should work though as this won't hold a file/files reference, it's just
>> a freeing reference.
>
> BTW, if the process-creator dies, then its ->files might be killed
> and ->sqo_files become dangling, so should be invalidated. Your
> approach with a task's reference probably handles it naturally.

We do prune and cancel if the process goes away, so it shouldn't have
that issue. But yes, it falls out naturally with the task based
approach.

-- 
Jens Axboe
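
For anyone following along, here is a minimal userspace sketch of the dup()+close() swap discussed in this thread. It is not io_uring code and not taken from the patch: it runs on any fd (a pipe end works), and the helper names swap_fd(), fd_is_open() and same_file() are made up for illustration. It only shows the two facts the discussion relies on: the old fd number starts failing with EBADF, while the new number still refers to the same underlying file.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: 1 if fd is open in this process, 0 if it is closed. */
static int fd_is_open(int fd)
{
    /* F_GETFD fails with EBADF when fd is not an open descriptor. */
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

/* Hypothetical helper: 1 if both stat results name the same inode and device. */
static int same_file(const struct stat *a, const struct stat *b)
{
    return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

/*
 * Hypothetical helper: the swap from the thread, applied to any fd.
 *
 *     fd = dup(ring_fd)
 *     close(ring_fd)
 *     ring_fd = fd
 *
 * Returns the new descriptor, or -1 if dup() failed (old_fd is then left open).
 */
static int swap_fd(int old_fd)
{
    int new_fd = dup(old_fd);

    if (new_fd < 0)
        return -1;
    close(old_fd);
    return new_fd;
}
```

Calling swap_fd() on one end of a pipe and then checking fd_is_open() on the old number and same_file() on fstat() results taken before and after shows the "many fds -> single file" mapping: the number changes, the file does not. Anything that cached only the old number has no way to notice the swap other than the EBADF.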