On 10/24/24 2:32 PM, Jann Horn wrote: > On Thu, Oct 24, 2024 at 10:25?PM Jens Axboe <axboe@xxxxxxxxx> wrote: >> On 10/24/24 2:08 PM, Jann Horn wrote: >>> On Thu, Oct 24, 2024 at 9:59?PM Jens Axboe <axboe@xxxxxxxxx> wrote: >>>> On 10/24/24 1:53 PM, Jann Horn wrote: >>>>> On Thu, Oct 24, 2024 at 9:50?PM Jens Axboe <axboe@xxxxxxxxx> wrote: >>>>>> On 10/24/24 12:13 PM, Jann Horn wrote: >>>>>>> On Thu, Oct 24, 2024 at 7:08?PM Jens Axboe <axboe@xxxxxxxxx> wrote: >>>>>>>> Add IORING_REGISTER_RESIZE_RINGS, which allows an application to resize >>>>>>>> the existing rings. It takes a struct io_uring_params argument, the same >>>>>>>> one which is used to setup the ring initially, and resizes rings >>>>>>>> according to the sizes given. >>>>>>> [...] >>>>>>>> + * We'll do the swap. Clear out existing mappings to prevent mmap >>>>>>>> + * from seeing them, as we'll unmap them. Any attempt to mmap existing >>>>>>>> + * rings beyond this point will fail. Not that it could proceed at this >>>>>>>> + * point anyway, as we'll hold the mmap_sem until we've done the swap. >>>>>>>> + * Likewise, hold the completion * lock over the duration of the actual >>>>>>>> + * swap. >>>>>>>> + */ >>>>>>>> + mmap_write_lock(current->mm); >>>>>>> >>>>>>> Why does the mmap lock for current->mm suffice here? I see nothing in >>>>>>> io_uring_mmap() that limits mmap() to tasks with the same mm_struct. >>>>>> >>>>>> Ehm does ->mmap() not hold ->mmap_sem already? I was under that >>>>>> understanding. Obviously if it doesn't, then yeah this won't be enough. >>>>>> Checked, and it does. >>>>>> >>>>>> Ah I see what you mean now, task with different mm. But how would that >>>>>> come about? The io_uring fd is CLOEXEC, and it can't get passed. >>>>> >>>>> Yeah, that's what I meant, tasks with different mm. I think there are >>>>> a few ways to get the io_uring fd into a different task, the ones I >>>>> can immediately think of: >>>>> >>>>> - O_CLOEXEC only applies on execve(), fork() should still inherit the fd >>>>> - O_CLOEXEC can be cleared via fcntl() >>>>> - you can use clone() to create two tasks that share FD tables >>>>> without sharing an mm >>>> >>>> OK good catch, yes then it won't be enough. Might just make sense to >>>> exclude mmap separately, then. Thanks, I'll work on that for v4! >>> >>> Yeah, that sounds reasonable to me. >> >> Something like this should do it, it's really just replacing mmap_sem >> with a ring private lock. And since the ordering already had to deal >> with uring_lock vs mmap_sem ABBA issues, this should slot straight in as >> well. > > Looks good to me at a glance. Great, thanks for checking Jann. In the first place as well, appreciate it. FWIW, compiled and ran through the testing, looks fine so far here. -- Jens Axboe