On 03/11/2020 00:05, Jens Axboe wrote:
> On 11/2/20 1:52 PM, Vito Caputo wrote:
>> Hello list,
>>
>> I've been tinkering a bit with some async continuation passing style
>> IO-oriented code employing liburing. This exposed a kind of awkward
>> behavior I suspect could be better from an ergonomics perspective.
>>
>> Imagine a bunch of OPENAT SQEs have been prepared, and they're all
>> relative to a common dirfd. Once io_uring_submit() has consumed all
>> these SQEs across the syscall boundary, logically it seems the dirfd
>> should be safe to close, since these dirfd-dependent operations have
>> all been submitted to the kernel.
>>
>> But when I attempted this, the subsequent OPENAT CQE results were all
>> -EBADFD errors. It appeared the submit didn't add any references to
>> the dependent dirfd.
>>
>> To work around this, I resorted to stowing the dirfd and maintaining a
>> shared refcount in the closures associated with these SQEs and
>> executed on their CQEs. This effectively forced replicating the
>> batched relationship implicit in the shared parent dirfd, where I
>> otherwise had zero need to. Just so I could defer closing the dirfd
>> until once all these closures had run on their respective CQE arrivals
>> and the refcount for the batch had reached zero.
>>
>> It doesn't seem right. If I ensure sufficient queue depth and
>> explicitly flush all the dependent SQEs beforehand
>> w/io_uring_submit(), it seems like I should be able to immediately
>> close(dirfd) and have the close be automagically deferred until the
>> last dependent CQE removes its reference from the kernel side.
>
> We pass the 'dfd' straight on, and only the async part acts on it.
> Which is why it needs to be kept open. But I wonder if we can get
> around it by just pinning the fd for the duration. Since you didn't
> include a test case, can you try with this patch applied? Totally
> untested...

afaik this doesn't pin an fd in a file table, so the app closes the dfd
right after submit and then do_filp_open() tries to look up the closed
dfd. Doesn't seem to work, and we need to pass that struct file to
do_filp_open().

>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 1f555e3c44cd..b3a647dd206b 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -3769,6 +3769,9 @@ static int __io_openat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe
>  		req->open.how.flags |= O_LARGEFILE;
>  
>  	req->open.dfd = READ_ONCE(sqe->fd);
> +	if (req->open.dfd != -1 && req->open.dfd != AT_FDCWD)
> +		req->file = fget(req->open.dfd);
> +
>  	fname = u64_to_user_ptr(READ_ONCE(sqe->addr));
>  	req->open.filename = getname(fname);
>  	if (IS_ERR(req->open.filename)) {
> @@ -3841,6 +3844,8 @@ static int io_openat2(struct io_kiocb *req, bool force_nonblock)
>  	}
>  err:
>  	putname(req->open.filename);
> +	if (req->file)
> +		fput(req->file);
>  	req->flags &= ~REQ_F_NEED_CLEANUP;
>  	if (ret < 0)
>  		req_set_fail_links(req);
> @@ -5876,6 +5881,8 @@ static void __io_clean_op(struct io_kiocb *req)
>  	case IORING_OP_OPENAT2:
>  		if (req->open.filename)
>  			putname(req->open.filename);
> +		if (req->file)
> +			fput(req->file);
>  		break;
>  	}
>  	req->flags &= ~REQ_F_NEED_CLEANUP;
>

-- 
Pavel Begunkov
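
For reference, a rough sketch of the userspace workaround Vito describes
at the top of the thread: a shared refcount that defers close(dirfd)
until the last dependent CQE has been handled. This assumes a
single-threaded completion loop (no atomics), and struct dirfd_batch,
batch_new() and batch_put() are hypothetical names made up for
illustration, not part of liburing:

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: one instance per batch of SQEs sharing a dirfd. */
struct dirfd_batch {
	int dirfd;	/* directory fd the OPENAT SQEs are relative to */
	unsigned refs;	/* CQEs still outstanding for this batch */
};

/* Allocate a batch that owns dirfd until nr_sqes completions are seen. */
static struct dirfd_batch *batch_new(int dirfd, unsigned nr_sqes)
{
	struct dirfd_batch *b = malloc(sizeof(*b));

	if (!b)
		return NULL;
	b->dirfd = dirfd;
	b->refs = nr_sqes;
	return b;
}

/*
 * Call once per completed CQE of the batch. The last call closes dirfd
 * and frees the batch. Returns 1 if this call closed dirfd, else 0.
 */
static int batch_put(struct dirfd_batch *b)
{
	assert(b->refs > 0);
	if (--b->refs)
		return 0;
	close(b->dirfd);
	free(b);
	return 1;
}
```

In a real program the batch pointer would presumably travel with each
SQE (e.g. via io_uring_sqe_set_data()) and batch_put() would run in the
CQE handler, which is exactly the bookkeeping the original report would
like to avoid.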