On 18/02/2021 12:27, Lennert Buytenhek wrote: > IORING_OP_GETDENTS behaves much like getdents64(2) and takes the same > arguments, but with a small twist: it takes an additional offset > argument, and reading from the specified directory starts at the given > offset. > > For the first IORING_OP_GETDENTS call on a directory, the offset > parameter can be set to zero, and for subsequent calls, it can be > set to the ->d_off field of the last struct linux_dirent64 returned > by the previous IORING_OP_GETDENTS call. > > Internally, if necessary, IORING_OP_GETDENTS will vfs_llseek() to > the right directory position before calling vfs_getdents(). > > IORING_OP_GETDENTS may or may not update the specified directory's > file offset, and the file offset should not be relied upon having > any particular value during or after an IORING_OP_GETDENTS call. > > Signed-off-by: Lennert Buytenhek <buytenh@xxxxxxxxxxxxxx> > --- > fs/io_uring.c | 73 +++++++++++++++++++++++++++++++++++ > include/uapi/linux/io_uring.h | 1 + > 2 files changed, 74 insertions(+) > > diff --git a/fs/io_uring.c b/fs/io_uring.c > index 056bd4c90ade..6853bf48369a 100644 > --- a/fs/io_uring.c > +++ b/fs/io_uring.c > @@ -635,6 +635,13 @@ struct io_mkdir { > struct filename *filename; > }; > [...] > +static int io_getdents(struct io_kiocb *req, unsigned int issue_flags) > +{ > + struct io_getdents *getdents = &req->getdents; > + bool pos_unlock = false; > + int ret = 0; > + > + /* getdents always requires a blocking context */ > + if (issue_flags & IO_URING_F_NONBLOCK) > + return -EAGAIN; > + > + /* for vfs_llseek and to serialize ->iterate_shared() on this file */ > + if (file_count(req->file) > 1) { Looks racy, is it safe? E.g. can be concurrently dupped and used, or just several similar IORING_OP_GETDENTS requests. > + pos_unlock = true; > + mutex_lock(&req->file->f_pos_lock); > + } > + > + if (req->file->f_pos != getdents->pos) { > + loff_t res = vfs_llseek(req->file, getdents->pos, SEEK_SET); I may be missing the previous discussions, but can this ever become stateless, like passing an offset? Including readdir.c and beyond. > + if (res < 0) > + ret = res; > + } > + > + if (ret == 0) { > + ret = vfs_getdents(req->file, getdents->dirent, > + getdents->count); > + } > + > + if (pos_unlock) > + mutex_unlock(&req->file->f_pos_lock); > + > + if (ret < 0) { > + if (ret == -ERESTARTSYS) > + ret = -EINTR; > + req_set_fail_links(req); > + } > + io_req_complete(req, ret); > + return 0; > +} [...] -- Pavel Begunkov