On 10/24/24 1:53 PM, Jann Horn wrote: > On Thu, Oct 24, 2024 at 9:50?PM Jens Axboe <axboe@xxxxxxxxx> wrote: >> On 10/24/24 12:13 PM, Jann Horn wrote: >>> On Thu, Oct 24, 2024 at 7:08?PM Jens Axboe <axboe@xxxxxxxxx> wrote: >>>> Add IORING_REGISTER_RESIZE_RINGS, which allows an application to resize >>>> the existing rings. It takes a struct io_uring_params argument, the same >>>> one which is used to setup the ring initially, and resizes rings >>>> according to the sizes given. >>> [...] >>>> + * We'll do the swap. Clear out existing mappings to prevent mmap >>>> + * from seeing them, as we'll unmap them. Any attempt to mmap existing >>>> + * rings beyond this point will fail. Not that it could proceed at this >>>> + * point anyway, as we'll hold the mmap_sem until we've done the swap. >>>> + * Likewise, hold the completion * lock over the duration of the actual >>>> + * swap. >>>> + */ >>>> + mmap_write_lock(current->mm); >>> >>> Why does the mmap lock for current->mm suffice here? I see nothing in >>> io_uring_mmap() that limits mmap() to tasks with the same mm_struct. >> >> Ehm does ->mmap() not hold ->mmap_sem already? I was under that >> understanding. Obviously if it doesn't, then yeah this won't be enough. >> Checked, and it does. >> >> Ah I see what you mean now, task with different mm. But how would that >> come about? The io_uring fd is CLOEXEC, and it can't get passed. > > Yeah, that's what I meant, tasks with different mm. I think there are > a few ways to get the io_uring fd into a different task, the ones I > can immediately think of: > > - O_CLOEXEC only applies on execve(), fork() should still inherit the fd > - O_CLOEXEC can be cleared via fcntl() > - you can use clone() to create two tasks that share FD tables > without sharing an mm OK good catch, yes then it won't be enough. Might just make sense to exclude mmap separately, then. Thanks, I'll work on that for v4! Probably trivial enough with just a separate mutex that's held over mmap, and that register grabs too. Basically replacing mmap_write_lock() on the resize side with ctx->mmap_lock, and grabbing that over validate/mmap on the io_uring mmap side. Need to ponder the implications of that in terms of how the locks nest, as you'd have mmap_sem -> ctx->mmap_lock, but the other way around for resizing. But that should just be for the pinning parts, which is why it does the mmap_sem grabbing where it does right now. -- Jens Axboe