On 2/23/20 4:02 AM, Pavel Begunkov wrote:
> On 23/02/2020 09:26, Jens Axboe wrote:
>> On 2/22/20 11:00 PM, Jens Axboe wrote:
>>> On 2/21/20 12:10 PM, Jens Axboe wrote:
>>>>> Got it. Then, it may happen in the future after returning from
>>>>> __io_arm_poll_handler() and io_uring_enter(). And by that time
>>>>> io_submit_sqes() should have already restored the creds (i.e. the
>>>>> personality stuff) on the way back. This might be a problem.
>>>>
>>>> Not sure I follow, can you elaborate? Just to be sure, the requests
>>>> that go through the poll handler will go through __io_queue_sqe()
>>>> again. Oh, I guess your point is that that is one level below where
>>>> we normally assign the creds.
>>>
>>> Fixed this one.
>
> Looking at
>
> io_async_task_func() {
> 	...
> 	/* ensure req->work.creds is valid for __io_queue_sqe() */
> 	req->work.creds = apoll->work.creds;
> }
>
> It copies the creds, but doesn't touch the rest of the req->work
> fields. And if you have one of them, you most probably got all of them
> in *grab_env(). Are you sure it doesn't leak, e.g. an mmgrab()'ed mm?

You're looking at a version that only existed for about 20 minutes, I
had to check that I'd actually pushed it out. But ce21471abe0fef is the
current one, and it does a full memcpy() of the work item.

>>>>> BTW, is it by design that all requests of a link use the
>>>>> personality creds specified in the head's sqe?
>>>>
>>>> No, I think that's more by accident. We should make sure they use
>>>> the specified creds, regardless of the issue time. Care to clean
>>>> that up? Would probably help get it right for the poll case, too.
>>>
>>> Took a look at this, and I think you're wrong. Every iteration of
>>> io_submit_sqe() will look up the right creds, and assign them to the
>>> current task in case we're going to issue it. In the case of a link
>>> where we already have the head, we grab the current work environment.
>>> That means assigning req->work.creds from get_current_cred(), if not
>>> already set, and those are the credentials we looked up already.
>
> Yeah, I'd spotted that something was wrong there, but never looked
> into it properly. And thanks for that!

>> What does look wrong is that we don't restore the right credentials
>> for queuing the head, so basically the opposite problem. Something
>> like the below should fix that.
>>
>> index de650df9ac53..59024e4757d6 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -4705,11 +4705,18 @@ static void __io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>>  {
>>  	struct io_kiocb *linked_timeout;
>>  	struct io_kiocb *nxt = NULL;
>> +	const struct cred *old_creds = NULL;
>>  	int ret;
>>
>>  again:
>>  	linked_timeout = io_prep_linked_timeout(req);
>>
>> +	if (req->work.creds && req->work.creds != get_current_cred()) {
>
> get_current_cred() gets a ref.

Oops, yes.

> See my attempt below, it fixes the miscount, and should work better
> for cases changing back to the initial creds (i.e. personality 0).

Thanks, I'll fold this in, if you don't mind.

> Anyway, creds handling is too scattered across the code, and this does
> a lot of useless refcounting and bouncing. It'd be better to find it a
> better place in the near future.

I think a good cleanup on top of this would be to move the personality
lookup to io_req_defer_prep(), and kill it from io_submit_sqe(). Now
that __io_issue_sqe() does the right thing, it'll just fall out nicely
with that, as far as I can tell. Care to send a patch for that?

-- 
Jens Axboe
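
PS: for anyone following along, here is roughly the shape the creds
switch in __io_queue_sqe() ends up taking once the refcount issue above
is accounted for. This is only a sketch of the idea discussed in this
thread, not the committed code; the function name is made up and the
actual issue path is elided.

static void io_queue_sqe_creds_sketch(struct io_kiocb *req)
{
	const struct cred *old_creds = NULL;

	/*
	 * Compare against current_cred(), which is just an accessor,
	 * rather than get_current_cred(), which takes a reference that
	 * nothing here would ever drop.
	 */
	if (req->work.creds && req->work.creds != current_cred())
		old_creds = override_creds(req->work.creds);

	/* ... __io_issue_sqe() and the rest of __io_queue_sqe() ... */

	/* restore the original credentials once the request is issued */
	if (old_creds)
		revert_creds(old_creds);
}

override_creds() hands back the credentials that were in effect before
the switch, so revert_creds() on that return value undoes it without
any extra refcount bookkeeping.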
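
And a rough sketch of the cleanup direction mentioned above, again
illustrative only (the helper name is invented and error handling is
trimmed): do the personality lookup once at prep time and stash a
reference in req->work.creds, so the issue path picks it up and
io_submit_sqe() never has to touch credentials.

/* sketch only: would be called from io_req_defer_prep() */
static int io_req_set_personality(struct io_kiocb *req, u16 personality)
{
	struct io_ring_ctx *ctx = req->ctx;
	const struct cred *creds;

	if (!personality)
		return 0;

	creds = idr_find(&ctx->personality_idr, personality);
	if (!creds)
		return -EINVAL;

	/* hold a reference for the lifetime of the request */
	req->work.creds = get_cred(creds);
	return 0;
}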