On 1/28/20 9:19 AM, Jens Axboe wrote: > On 1/28/20 9:17 AM, Stefan Metzmacher wrote: >> Am 28.01.20 um 17:10 schrieb Jens Axboe: >>> On 1/28/20 3:18 AM, Stefan Metzmacher wrote: >>>> Hi Jens, >>>> >>>> now that we have IORING_FEAT_CUR_PERSONALITY... >>>> >>>> How can we optimize the fileserver case now, in order to avoid the >>>> overhead of always calling 5 syscalls before io_uring_enter()?: >>>> >>>> /* gain root again */ >>>> setresuid(-1,0,-1); setresgid(-1,0,-1) >>>> /* impersonate the user with groups */ >>>> setgroups(num, grps); setresgid(-1,gid,-1); setresuid(-1,uid,-1); >>>> /* trigger the operation */ >>>> io_uring_enter(); >>>> >>>> I guess some kind of IORING_REGISTER_CREDS[_UPDATE] would be >>>> good, together with a IOSQE_FIXED_CREDS in order to specify >>>> credentials per operation. >>>> >>>> Or we make it much more generic and introduce a credsfd_create() >>>> syscall in order to get an fd for a credential handle, maybe >>>> together with another syscall to activate the credentials of >>>> the current thread (or let a write to the fd trigger the activation >>>> in order to avoid an additional syscall number). >>>> >>>> Having just an fd would allow IORING_REGISTER_CREDS[_UPDATE] >>>> to be just an array of int values instead of a more complex >>>> structure to define the credentials. >>> >>> I'd rather avoid having to add more infrastructure for this, even if >>> credsfd_create() would be nifty. >>> >>> With that in mind, something like: >>> >>> - Application does IORING_REGISTER_CREDS, which returns some index >>> >>> - Add a IORING_OP_USE_CREDS opcode, which sets the creds associated >>> with dependent commands >>> - Actual request is linked to the IORING_OP_USE_CREDS command, any >>> link off IORING_OP_USE_CREDS will use those credentials >> >> Using links for this sounds ok. > > Great! I'll try and hack this up and see how it turns out. > >>> - IORING_UNREGISTER_CREDS removes the registered creds >>> >>> Just throwing that out there, definitely willing to entertain other >>> methods that make sense for this. Trying to avoid needing to put this >>> information in the SQE itself, hence the idea to use a chain of links >>> for it. >>> >>> The downside is that we'll need to maintain an array of key -> creds, >>> but that's probably not a big deal. >>> >>> What do you think? >> >> So IORING_REGISTER_CREDS would be a simple operation that just takes a >> snapshot of the current_cred() and returns an id that can be passed to >> IORING_OP_USE_CREDS or IORING_UNREGISTER_CREDS? > > Right, you would not pass in any arguments, it'd have to be run from the > personality you wish to register. It simply returns an integer, which is > a key to use for IORING_OP_USE_CREDS, or at the end for > IORING_UNREGISTER_CREDS when you no longer wish to use this personality. > >>> Ideally I'd like to get this done for 5.6 even if we >>> are a bit late, so you'll have everything you need with that release. >> >> That would be great! > > Crossing fingers... OK, so here are two patches for testing: https://git.kernel.dk/cgit/linux-block/log/?h=for-5.6/io_uring-vfs-creds #1 adds support for registering the personality of the invoking task, and #2 adds support for IORING_OP_USE_CREDS. Right now it's limited to just having one link, it doesn't support a chain of them. I'll try and write a test case for this just to see if it actually works, so far it's totally untested. Adding Pavel to the CC. -- Jens Axboe