On 1/27/20 2:45 PM, Pavel Begunkov wrote: > On 27/01/2020 23:33, Jens Axboe wrote: >> On 1/27/20 7:07 AM, Pavel Begunkov wrote: >>> On 1/27/2020 4:39 PM, Jens Axboe wrote: >>>> On 1/27/20 6:29 AM, Pavel Begunkov wrote: >>>>> On 1/26/2020 8:00 PM, Jens Axboe wrote: >>>>>> On 1/26/20 8:11 AM, Pavel Begunkov wrote: >>>>>>> On 1/26/2020 4:51 AM, Daurnimator wrote: >>>>>>>> On Fri, 24 Jan 2020 at 10:16, Jens Axboe <axboe@xxxxxxxxx> wrote: >>>>> Ok. I can't promise it'll play handy for sharing. Though, you'll be out >>>>> of space in struct io_uring_params soon anyway. >>>> >>>> I'm going to keep what we have for now, as I'm really not imagining a >>>> lot more sharing - what else would we share? So let's not over-design >>>> anything. >>>> >>> Fair enough. I prefer a ptr to an extendable struct, that will take the >>> last u64, when needed. >>> >>> However, it's still better to share through file descriptors. It's just >>> not secure enough the way it's now. >> >> Is the file descriptor value really a good choice? We just had some >> confusion on ring sharing across forks. Not sure using an fd value >> is a sane "key" to use across processes. >> > As I see it, the problem with @mm is that uring is dead-bound to it. > For example, a process can create and send uring (e.g. via socket), > and then be killed. And that basically means > 1. @mm of the process is locked just because of the sent uring > instance. > 2. a process may have an io_uring, which bound to @mm of another > process, even though the layouts may be completely different. > > File descriptors are different here, because io_uring doesn't know > about them, They are controlled by the userspace (send, dup, fork, > etc), and don't sabotage all isolation work done in the kernel. A dire > example here is stealing io-wq from within a container, which is > trivial with global self-made id. I would love to hear, if I am > mistaken somewhere. > > Is there some better option? OK, so how about this: - We use the 'fd' as the lookup key. This makes it easy since we can just check if it's a io_uring instance or not, we don't need to do any tracking on the side. It also means that the application asking for sharing must already have some relationship to the process that created the ring. - mm/creds must be transferred through the work item. Any SQE done on behalf of io_uring_enter() directly already has that, if punted we must pass the creds and mm. This means we break the static setup of io_wq->mm/creds. It also means that we probably have to add that to io_wq_work, which kind of sucks, but... I think with that we have a decent setup, that's also safe. I've dropped the sharing patches for now, from the 5.6 tree. -- Jens Axboe