On 11/19/24 00:30, Joanne Koong wrote: > On Thu, Nov 7, 2024 at 9:04 AM Bernd Schubert <bschubert@xxxxxxx> wrote: >> >> When the fuse-server terminates while the fuse-client or kernel >> still has queued URING_CMDs, these commands retain references >> to the struct file used by the fuse connection. This prevents >> fuse_dev_release() from being invoked, resulting in a hung mount >> point. >> >> This patch addresses the issue by making queued URING_CMDs >> cancelable, allowing fuse_dev_release() to proceed as expected >> and preventing the mount point from hanging. >> >> Signed-off-by: Bernd Schubert <bschubert@xxxxxxx> >> --- >> fs/fuse/dev_uring.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++----- >> 1 file changed, 70 insertions(+), 6 deletions(-) >> >> diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c >> index 6af515458695ccb2e32cc8c62c45471e6710c15f..b465da41c42c47eaf69f09bab1423061bc8fcc68 100644 >> --- a/fs/fuse/dev_uring.c >> +++ b/fs/fuse/dev_uring.c >> @@ -23,6 +23,7 @@ MODULE_PARM_DESC(enable_uring, >> >> struct fuse_uring_cmd_pdu { >> struct fuse_ring_ent *ring_ent; >> + struct fuse_ring_queue *queue; >> }; >> >> /* >> @@ -382,6 +383,61 @@ void fuse_uring_stop_queues(struct fuse_ring *ring) >> } >> } >> >> +/* >> + * Handle IO_URING_F_CANCEL, typically should come on daemon termination >> + */ >> +static void fuse_uring_cancel(struct io_uring_cmd *cmd, >> + unsigned int issue_flags, struct fuse_conn *fc) >> +{ >> + struct fuse_uring_cmd_pdu *pdu = (struct fuse_uring_cmd_pdu *)cmd->pdu; >> + struct fuse_ring_queue *queue = pdu->queue; >> + struct fuse_ring_ent *ent; >> + bool found = false; >> + bool need_cmd_done = false; >> + >> + spin_lock(&queue->lock); >> + >> + /* XXX: This is cumbersome for large queues. */ >> + list_for_each_entry(ent, &queue->ent_avail_queue, list) { >> + if (pdu->ring_ent == ent) { >> + found = true; >> + break; >> + } >> + } > > Do we have to check that the entry is on the ent_avail_queue, or can > we assume that if the ent->state is FRRS_WAIT, the only queue it'll be > on is the ent_avail_queue? I see only one case where this isn't true, > for teardown in fuse_uring_stop_list_entries() - if we had a > workaround for this, eg having some teardown state signifying that > io_uring_cmd_done() needs to be called on the cmd and clearing > FRRS_WAIT, then we could get rid of iteration through ent_avail_queue > for every cancelled cmd. I'm scared that we would run into races - I don't want to access memory pointed to by pdu->ring_ent, before I'm not sure it is on the list. Remember the long discussion Miklos and I had about the tiny 'tag' variable and finding requests using existing hash lists [0] ? Personally I would prefer an array of struct queue_entries { struct fuse_ring_ent *ring_ent; bool valid; } in struct fuse_ring_queue { ... struct queue_entries *entries[] } And that array would only get freed on queue destruction. Besides avoiding hash lists, it would also allow to use 'valid' to know if we can access the ring_entry and then check the state. Thanks, Bernd [0] https://lore.kernel.org/linux-fsdevel/CAJfpegu_UQ1BNp0UDHeOZFWwUoXbJ_LP4W=o+UX6MB3DsJbH8g@xxxxxxxxxxxxxx/T/#t