On 11/19/24 03:02, Joanne Koong wrote: > On Mon, Nov 18, 2024 at 3:47 PM Bernd Schubert > <bernd.schubert@xxxxxxxxxxx> wrote: >> >> On 11/19/24 00:30, Joanne Koong wrote: >>> On Thu, Nov 7, 2024 at 9:04 AM Bernd Schubert <bschubert@xxxxxxx> wrote: >>>> >>>> When the fuse-server terminates while the fuse-client or kernel >>>> still has queued URING_CMDs, these commands retain references >>>> to the struct file used by the fuse connection. This prevents >>>> fuse_dev_release() from being invoked, resulting in a hung mount >>>> point. >>>> >>>> This patch addresses the issue by making queued URING_CMDs >>>> cancelable, allowing fuse_dev_release() to proceed as expected >>>> and preventing the mount point from hanging. >>>> >>>> Signed-off-by: Bernd Schubert <bschubert@xxxxxxx> >>>> --- >>>> fs/fuse/dev_uring.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++----- >>>> 1 file changed, 70 insertions(+), 6 deletions(-) >>>> >>>> diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c >>>> index 6af515458695ccb2e32cc8c62c45471e6710c15f..b465da41c42c47eaf69f09bab1423061bc8fcc68 100644 >>>> --- a/fs/fuse/dev_uring.c >>>> +++ b/fs/fuse/dev_uring.c >>>> @@ -23,6 +23,7 @@ MODULE_PARM_DESC(enable_uring, >>>> >>>> struct fuse_uring_cmd_pdu { >>>> struct fuse_ring_ent *ring_ent; >>>> + struct fuse_ring_queue *queue; >>>> }; >>>> >>>> /* >>>> @@ -382,6 +383,61 @@ void fuse_uring_stop_queues(struct fuse_ring *ring) >>>> } >>>> } >>>> >>>> +/* >>>> + * Handle IO_URING_F_CANCEL, typically should come on daemon termination >>>> + */ >>>> +static void fuse_uring_cancel(struct io_uring_cmd *cmd, >>>> + unsigned int issue_flags, struct fuse_conn *fc) >>>> +{ >>>> + struct fuse_uring_cmd_pdu *pdu = (struct fuse_uring_cmd_pdu *)cmd->pdu; >>>> + struct fuse_ring_queue *queue = pdu->queue; >>>> + struct fuse_ring_ent *ent; >>>> + bool found = false; >>>> + bool need_cmd_done = false; >>>> + >>>> + spin_lock(&queue->lock); >>>> + >>>> + /* XXX: This is cumbersome for large queues. */ >>>> + list_for_each_entry(ent, &queue->ent_avail_queue, list) { >>>> + if (pdu->ring_ent == ent) { >>>> + found = true; >>>> + break; >>>> + } >>>> + } >>> >>> Do we have to check that the entry is on the ent_avail_queue, or can >>> we assume that if the ent->state is FRRS_WAIT, the only queue it'll be >>> on is the ent_avail_queue? I see only one case where this isn't true, >>> for teardown in fuse_uring_stop_list_entries() - if we had a >>> workaround for this, eg having some teardown state signifying that >>> io_uring_cmd_done() needs to be called on the cmd and clearing >>> FRRS_WAIT, then we could get rid of iteration through ent_avail_queue >>> for every cancelled cmd. >> >> >> I'm scared that we would run into races - I don't want to access memory >> pointed to by pdu->ring_ent, before I'm not sure it is on the list. > > Oh, I was seeing that we mark the cmd as cancellable (eg in > fuse_uring_prepare_cancel()) only after the ring_ent is moved to the > ent_avail_queue (in fuse_uring_ent_avail()) and that this is done in > the scope of the queue->lock, so we would only call into > fuse_uring_cancel() when the ring_ent is on the list. Could there > still be a race condition here? > > Alternatively, inspired by your "bool valid;" idea below, maybe > another solution would be having a bit in "struct fuse_ring_ent" > tracking if io_uring_cmd_done() needs to be called on it? What I mean is that daemon termination might race with normal umount. Umount does everything cleanly and iterates through lists, but might free 'struct fuse_ring_ent', see fuse_uring_entry_teardown(). On the other hand, daemon termination with IO_URING_F_CANCEL has the pointer to ring_ent - but that pointer might be already freed by umount. That also means another bit in "struct fuse_ring_ent" won't help. > > This is fairly unimportant though - this part could always be > optimized in a future patchset if you think it needs to be optimized, > but was just curious if these would work. > I'm going to change logic a bit and will introduce another list 'freeable_ring_ent'. Entries will be moved to that new list and only freed in fuse_uring_destruct(). After that IO_URING_F_CANCEL can check stat of ring_ent directly Thanks for the discussion! Bernd