On Fri, Oct 18, 2019 at 4:01 AM Jens Axboe <axboe@xxxxxxxxx> wrote: > This is in preparation for adding opcodes that need to modify files > in a process file table, either adding new ones or closing old ones. Closing old ones would be tricky. Basically if you call get_files_struct() while you're between an fdget()/fdput() pair (e.g. from sys_io_uring_enter()), you're not allowed to use that files_struct reference to replace or close existing FDs through that reference. (Or more accurately, if you go through fdget() with files_struct refcount 1, you must not replace/close FDs in there in any way until you've passed the corresponding fdput().) You can avoid that if you ensure that you never use fdget()/fdput() in the relevant places, only fget()/fput(). > If an opcode needs this, it must set REQ_F_NEED_FILES in the request > structure. If work that needs to get punted to async context have this > set, they will grab a reference to the process file table. When the > work is completed, the reference is dropped again. [...] > @@ -2220,6 +2223,10 @@ static void io_sq_wq_submit_work(struct work_struct *work) > set_fs(USER_DS); > } > } > + if (s->files && !old_files) { > + old_files = current->files; > + current->files = s->files; > + } AFAIK e.g. stuff like proc_fd_link() in procfs can concurrently call get_files_struct() even on kernel tasks, so you should take the task_lock(current) while fiddling with the ->files pointer. Also, maybe I'm too tired to read this correctly, but it seems like when io_sq_wq_submit_work() is processing multiple elements with ->files pointers, this part will only consume a reference to the first one? > > if (!ret) { > s->has_user = cur_mm != NULL; > @@ -2312,6 +2319,11 @@ static void io_sq_wq_submit_work(struct work_struct *work) > unuse_mm(cur_mm); > mmput(cur_mm); > } > + if (old_files) { > + struct files_struct *files = current->files; > + current->files = old_files; > + put_files_struct(files); > + } And then here the first files_struct reference is dropped, and the rest of them leak? > } > > /* > @@ -2413,6 +2425,8 @@ static int __io_queue_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req, > > s->sqe = sqe_copy; > memcpy(&req->submit, s, sizeof(*s)); > + if (req->flags & REQ_F_NEED_FILES) > + req->submit.files = get_files_struct(current); Stupid question: How does this interact with sqpoll mode? In that case, this function is running on a kernel thread that isn't sharing the application's files_struct, right?