On 11/27/24 13:40, Bernd Schubert wrote:
This adds basic support for ring SQEs (with opcode=IORING_OP_URING_CMD). For now only FUSE_URING_REQ_FETCH is handled to register queue entries. Signed-off-by: Bernd Schubert <bschubert@xxxxxxx> --- fs/fuse/Kconfig | 12 ++ fs/fuse/Makefile | 1 + fs/fuse/dev.c | 4 + fs/fuse/dev_uring.c | 318 ++++++++++++++++++++++++++++++++++++++++++++++ fs/fuse/dev_uring_i.h | 115 +++++++++++++++++ fs/fuse/fuse_i.h | 5 + fs/fuse/inode.c | 10 ++ include/uapi/linux/fuse.h | 67 ++++++++++ 8 files changed, 532 insertions(+)
diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c
new file mode 100644 index 0000000000000000000000000000000000000000..af9c5f116ba1dcf6c01d0359d1a06491c92c32f9 --- /dev/null +++ b/fs/fuse/dev_uring.c
...
+ +/* + * fuse_uring_req_fetch command handling + */ +static void _fuse_uring_fetch(struct fuse_ring_ent *ring_ent, + struct io_uring_cmd *cmd, + unsigned int issue_flags) +{ + struct fuse_ring_queue *queue = ring_ent->queue; + + spin_lock(&queue->lock); + fuse_uring_ent_avail(ring_ent, queue); + ring_ent->cmd = cmd; + spin_unlock(&queue->lock); +} + +/* + * sqe->addr is a ptr to an iovec array, iov[0] has the headers, iov[1] + * the payload + */ +static int fuse_uring_get_iovec_from_sqe(const struct io_uring_sqe *sqe, + struct iovec iov[FUSE_URING_IOV_SEGS]) +{ + struct iovec __user *uiov = u64_to_user_ptr(READ_ONCE(sqe->addr)); + struct iov_iter iter; + ssize_t ret; + + if (sqe->len != FUSE_URING_IOV_SEGS) + return -EINVAL; + + /* + * Direction for buffer access will actually be READ and WRITE, + * using write for the import should include READ access as well. + */ + ret = import_iovec(WRITE, uiov, FUSE_URING_IOV_SEGS, + FUSE_URING_IOV_SEGS, &iov, &iter);
You're throwing away the iterator, I'd be a bit cautious about it. FUSE_URING_IOV_SEGS is 2, so it should avoid ITER_UBUF, but Jens can say if it's abuse of the API or not. Fwiw, it's not the first place I know of that just want to get an iovec avoiding playing games with different iterator modes.
+ if (ret < 0) + return ret; + + return 0; +} + +static int fuse_uring_fetch(struct io_uring_cmd *cmd, unsigned int issue_flags, + struct fuse_conn *fc) +{ + const struct fuse_uring_cmd_req *cmd_req = io_uring_sqe_cmd(cmd->sqe);
You need to check that the ring is setup with SQE128. (!(issue_flags & IO_URING_F_SQE128)) // fail
+ struct fuse_ring *ring = fc->ring; + struct fuse_ring_queue *queue; + struct fuse_ring_ent *ring_ent; + int err; + struct iovec iov[FUSE_URING_IOV_SEGS]; + + err = fuse_uring_get_iovec_from_sqe(cmd->sqe, iov); + if (err) { + pr_info_ratelimited("Failed to get iovec from sqe, err=%d\n", + err); + return err; + } + + err = -ENOMEM; + if (!ring) { + ring = fuse_uring_create(fc); + if (!ring) + return err; + } + + queue = ring->queues[cmd_req->qid]; + if (!queue) { + queue = fuse_uring_create_queue(ring, cmd_req->qid); + if (!queue) + return err; + } + + /* + * The created queue above does not need to be destructed in + * case of entry errors below, will be done at ring destruction time. + */ + + ring_ent = kzalloc(sizeof(*ring_ent), GFP_KERNEL_ACCOUNT); + if (ring_ent == NULL) + return err; + + INIT_LIST_HEAD(&ring_ent->list); + + ring_ent->queue = queue; + ring_ent->cmd = cmd;
nit: seems it's also set immediately after in _fuse_uring_fetch().
+ + err = -EINVAL; + if (iov[0].iov_len < sizeof(struct fuse_uring_req_header)) { + pr_info_ratelimited("Invalid header len %zu\n", iov[0].iov_len); + goto err; + } +
...
+/* + * Entry function from io_uring to handle the given passthrough command + * (op cocde IORING_OP_URING_CMD) + */ +int fuse_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags) +{ + struct fuse_dev *fud; + struct fuse_conn *fc; + u32 cmd_op = cmd->cmd_op; + int err; + + /* Disabled for now, especially as teardown is not implemented yet */ + pr_info_ratelimited("fuse-io-uring is not enabled yet\n"); + return -EOPNOTSUPP;
Do compilers give warnings about such things? Unreachable code, maybe. I don't care much, but if they do to avoid breaking CONFIG_WERROR you might want to do sth about it. E.g. I'd usually mark the function __maybe_unused and not set it into fops until a later patch. ... -- Pavel Begunkov