On 3/4/21 5:15 AM, Stefan Metzmacher wrote:
>
> Hi Jens,
>
>> If the original task switches credentials or unshares any part of the
>> task state, then we should notify the io_uring workers so they can
>> re-fork as well. For credentials, this actually happens just fine for
>> the io-wq workers, as we grab and pass that down. For SQPOLL, we're
>> stuck with the original credentials, which means that it cannot be used
>> if the task does e.g. seteuid().
>
> I fear that this will be very problematic for Samba's use of io_uring.
>
> We change credentials very often, switching between the impersonated
> users and also root in order to run privileged sections.
>
> Currently fd-based operations are not affected by the credential
> switches.
>
> I guess any credential switch means that all pending io_uring requests
> get canceled, correct?
>
> It also means the usage of IORING_REGISTER_PERSONALITY isn't useful
> any longer, as that requires a credential switch before (and most
> likely after) the io_uring_register() syscall.
>
> As seteuid(), unshare() and similar syscalls are per thread instead of
> per process in the kernel, the io_wq is also per userspace thread and
> not per io_ring_ctx, correct?
>
> As I read the code, any credential change in any userspace thread will
> cause the sq_thread to be stopped and restarted by the next
> io_uring_enter(), which means that the sq_thread may change its main
> credentials randomly over time, depending on which userspace thread
> calls io_uring_enter() first. As unshare() applies only to the current
> task_struct, I'm wondering if we only want to refork the sq_thread if
> the current task is the parent of the sq_thread.
>
> I'm wondering if we'd better remove io_uring_unshare() from
> commit_creds() and always handle the creds explicitly as
> req->work.creds. io_init_req() will then set req->work.creds from
> ctx->personality_idr or from current->cred. At the same time we'd
> re-add ctx->creds = get_current_cred(); in io_uring_setup() and use
> these ctx->creds in the io_sq_thread again in order to make things
> sane again.
>
> I'm also wondering if all this has an impact on
> IORING_SETUP_ATTACH_WQ, in particular I'm thinking of the case where
> the fd was transferred via SCM_RIGHTS or across fork(), when mm and
> files are not shared between the processes.
>
> I think the IORING_FEAT_CUR_PERSONALITY section in io_uring_setup.2
> should also talk about what credentials are used in the
> IORING_SETUP_SQPOLL case.
>
> The IORING_SETUP_SQPOLL section should also be more detailed regarding
> what state is used, in particular in combination with
> IORING_SETUP_ATTACH_WQ. Wasn't the idea up to 5.11 that the sq_thread
> would capture the whole state at io_uring_setup()?
>
> I think we need to maintain the overall behavior exposed to
> userspace...

Totally agree - I'll get to this a bit later, I think we have better
ways of handling it. For now I'll drop this patch so we have time to
rethink it.

-- 
Jens Axboe