On 1/8/20 4:11 PM, Stefan Metzmacher wrote: > Am 09.01.20 um 00:05 schrieb Jens Axboe: >> On 1/8/20 4:03 PM, Stefan Metzmacher wrote: >>> Am 08.01.20 um 23:53 schrieb Jens Axboe: >>>> On 1/8/20 10:04 AM, Stefan Metzmacher wrote: >>>>> Am 08.01.20 um 17:40 schrieb Jens Axboe: >>>>>> On 1/8/20 9:32 AM, Stefan Metzmacher wrote: >>>>>>> Am 08.01.20 um 17:20 schrieb Jens Axboe: >>>>>>>> On 1/8/20 6:05 AM, Stefan Metzmacher wrote: >>>>>>>>> Hi Jens, >>>>>>>>> >>>>>>>>>> This works just like openat(2), except it can be performed async. For >>>>>>>>>> the normal case of a non-blocking path lookup this will complete >>>>>>>>>> inline. If we have to do IO to perform the open, it'll be done from >>>>>>>>>> async context. >>>>>>>>> >>>>>>>>> Did you already thought about the credentials being used for the async >>>>>>>>> open? The application could call setuid() and similar calls to change >>>>>>>>> the credentials of the userspace process/threads. In order for >>>>>>>>> applications like samba to use this async openat, it would be required >>>>>>>>> to specify the credentials for each open, as we have to multiplex >>>>>>>>> requests from multiple user sessions in one process. >>>>>>>>> >>>>>>>>> This applies to non-fd based syscall. Also for an async connect >>>>>>>>> to a unix domain socket. >>>>>>>>> >>>>>>>>> Do you have comments on this? >>>>>>>> >>>>>>>> The open works like any of the other commands, it inherits the >>>>>>>> credentials that the ring was setup with. Same with the memory context, >>>>>>>> file table, etc. There's currently no way to have multiple personalities >>>>>>>> within a single ring. >>>>>>> >>>>>>> Ah, it's user = get_uid(current_user()); and ctx->user = user in >>>>>>> io_uring_create(), right? >>>>>> >>>>>> That's just for the accounting, it's the: >>>>>> >>>>>> ctx->creds = get_current_cred(); >>>>> >>>>> Ok, I just looked at an old checkout. >>>>> >>>>> In kernel-dk-block/for-5.6/io_uring-vfs I see this only used in >>>>> the async processing. Does a non-blocking openat also use ctx->creds? >>>> >>>> There's basically two sets here - one set is in the ring, and the other >>>> is the identity that the async thread (briefly) assumes if we have to go >>>> async. Right now they are the same thing, and hence we don't need to >>>> play any tricks off the system call submitting SQEs to assume any other >>>> identity than the one we have. >>> >>> I see two cases using it io_sq_thread() and >>> io_wq_create()->io_worker_handle_work() call override_creds(). >>> >>> But aren't non-blocking syscall executed in the context of the thread >>> calling io_uring_enter()->io_submit_sqes()? >>> In only see some magic around ctx->sqo_mm for that case, but ctx->creds >>> doesn't seem to be used in that case. And my design would require that. >> >> For now, the sq thread (which is used if you use IORING_SETUP_SQPOLL) >> currently requires fixed files, so it can't be used with open at the >> moment anyway. But if/when enabled, it'll assume the same credentials >> as the async context and syscall path. > > I'm sorry, but I'm still unsure we're talking about the same thing > (or maybe I'm missing some basics here). > > My understanding of the io_uring_enter() is that it will execute as much > non-blocking calls as it can without switching to any other kernel thread. Correct, any SQE that we can do without switching, we will. > And my fear is that openat will use get_current_cred() instead of > ctx->creds. OK, I think I follow your concern. So you'd like to setup the rings from a _different_ user, and then later on use it for submission for SQEs that a specific user. So sort of the same as our initial discussion, except the mapping would be static. The difference being that you might setup the ring from a different user than the user that would be submitting IO on it? If so, then we do need something to support that, probably an IORING_REGISTER_CREDS or similar. This would allow you to replace the creds you currently have in ctx->creds with whatever new one. > I'm I missing something? I think we're talking about the same thing, just different views of it :-) -- Jens Axboe