On 1/7/25 15:42, Luis Henriques wrote:
Hi,
On Tue, Jan 07 2025, Bernd Schubert wrote:
This adds support for fuse request completion through ring SQEs
(FUSE_URING_CMD_COMMIT_AND_FETCH handling). After committing
the ring entry it becomes available for new fuse requests.
Handling of requests through the ring (SQE/CQE handling)
is complete now.
Fuse request data are copied through the mmaped ring buffer,
there is no support for any zero copy yet.
Please find below a few more comments.
Thanks, I fixed all comments, except of retry in fuse_uring_next_fuse_req.
[...]
Also, please note that I'm trying to understand this patchset (and the
whole fuse-over-io-uring thing), so most of my comments are minor nits.
And those that are not may simply be wrong! I'm just noting them as I
navigate through the code.
(And by the way, thanks for this work!)
+/*
+ * Get the next fuse req and send it
+ */
+static void fuse_uring_next_fuse_req(struct fuse_ring_ent *ring_ent,
+ struct fuse_ring_queue *queue,
+ unsigned int issue_flags)
+{
+ int err;
+ bool has_next;
+
+retry:
+ spin_lock(&queue->lock);
+ fuse_uring_ent_avail(ring_ent, queue);
+ has_next = fuse_uring_ent_assign_req(ring_ent);
+ spin_unlock(&queue->lock);
+
+ if (has_next) {
+ err = fuse_uring_send_next_to_ring(ring_ent, issue_flags);
+ if (err)
+ goto retry;
I wonder whether this is safe. Maybe this is *obviously* safe, but I'm
still trying to understand this patchset; so, for me, it is not :-)
Would it be worth it trying to limit the maximum number of retries?
No, we cannot limit retries. Let's do a simple example with one ring
entry and also just one queue. Multiple applications create fuse
requests. The first application fills the only available ring entry
and submits it, the others just get queued in queue->fuse_req_queue.
After that the application just waits request_wait_answer()
On commit of the first request the ring task has to take the next
request from queue->fuse_req_queue - if something fails with that
request it has to complete it and proceed to the next request.
If we would introduce a max-retries here, it would put the ring entry
on hold (FRRS_AVAILABLE) and until another application comes, it would
forever wait there. The applications waiting in request_wait_answer
would never complete either.
+ }
+}
+
+static int fuse_ring_ent_set_commit(struct fuse_ring_ent *ent)
+{
+ struct fuse_ring_queue *queue = ent->queue;
+
+ lockdep_assert_held(&queue->lock);
+
+ if (WARN_ON_ONCE(ent->state != FRRS_USERSPACE))
+ return -EIO;
+
+ ent->state = FRRS_COMMIT;
+ list_move(&ent->list, &queue->ent_commit_queue);
+
+ return 0;
+}
+
+/* FUSE_URING_CMD_COMMIT_AND_FETCH handler */
+static int fuse_uring_commit_fetch(struct io_uring_cmd *cmd, int issue_flags,
+ struct fuse_conn *fc)
+{
+ const struct fuse_uring_cmd_req *cmd_req = io_uring_sqe_cmd(cmd->sqe);
+ struct fuse_ring_ent *ring_ent;
+ int err;
+ struct fuse_ring *ring = fc->ring;
+ struct fuse_ring_queue *queue;
+ uint64_t commit_id = READ_ONCE(cmd_req->commit_id);
+ unsigned int qid = READ_ONCE(cmd_req->qid);
+ struct fuse_pqueue *fpq;
+ struct fuse_req *req;
+
+ err = -ENOTCONN;
+ if (!ring)
+ return err;
+
+ if (qid >= ring->nr_queues)
+ return -EINVAL;
+
+ queue = ring->queues[qid];
+ if (!queue)
+ return err;
+ fpq = &queue->fpq;
+
+ spin_lock(&queue->lock);
+ /* Find a request based on the unique ID of the fuse request
+ * This should get revised, as it needs a hash calculation and list
+ * search. And full struct fuse_pqueue is needed (memory overhead).
+ * As well as the link from req to ring_ent.
+ */
+ req = fuse_request_find(fpq, commit_id);
+ err = -ENOENT;
+ if (!req) {
+ pr_info("qid=%d commit_id %llu not found\n", queue->qid,
+ commit_id);
+ spin_unlock(&queue->lock);
+ return err;
+ }
+ list_del_init(&req->list);
+ ring_ent = req->ring_entry;
+ req->ring_entry = NULL;
+
+ err = fuse_ring_ent_set_commit(ring_ent);
+ if (err != 0) {
I'm probably missing something, but because we removed 'req' from the list
above, aren't we leaking it if we get an error here?
Hmm, yeah, that is debatable. We basically have a grave error here.
Either kernel or userspace are doing something wrong. Though probably
you are right and we should end the request with EIO.
Thanks,
Bernd