Re: [PATCH RFC v5 15/16] fuse: {io-uring} Prevent mount point hang on fuse-server termination

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Nov 18, 2024 at 3:47 PM Bernd Schubert
<bernd.schubert@xxxxxxxxxxx> wrote:
>
> On 11/19/24 00:30, Joanne Koong wrote:
> > On Thu, Nov 7, 2024 at 9:04 AM Bernd Schubert <bschubert@xxxxxxx> wrote:
> >>
> >> When the fuse-server terminates while the fuse-client or kernel
> >> still has queued URING_CMDs, these commands retain references
> >> to the struct file used by the fuse connection. This prevents
> >> fuse_dev_release() from being invoked, resulting in a hung mount
> >> point.
> >>
> >> This patch addresses the issue by making queued URING_CMDs
> >> cancelable, allowing fuse_dev_release() to proceed as expected
> >> and preventing the mount point from hanging.
> >>
> >> Signed-off-by: Bernd Schubert <bschubert@xxxxxxx>
> >> ---
> >>  fs/fuse/dev_uring.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++-----
> >>  1 file changed, 70 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c
> >> index 6af515458695ccb2e32cc8c62c45471e6710c15f..b465da41c42c47eaf69f09bab1423061bc8fcc68 100644
> >> --- a/fs/fuse/dev_uring.c
> >> +++ b/fs/fuse/dev_uring.c
> >> @@ -23,6 +23,7 @@ MODULE_PARM_DESC(enable_uring,
> >>
> >>  struct fuse_uring_cmd_pdu {
> >>         struct fuse_ring_ent *ring_ent;
> >> +       struct fuse_ring_queue *queue;
> >>  };
> >>
> >>  /*
> >> @@ -382,6 +383,61 @@ void fuse_uring_stop_queues(struct fuse_ring *ring)
> >>         }
> >>  }
> >>
> >> +/*
> >> + * Handle IO_URING_F_CANCEL, typically should come on daemon termination
> >> + */
> >> +static void fuse_uring_cancel(struct io_uring_cmd *cmd,
> >> +                             unsigned int issue_flags, struct fuse_conn *fc)
> >> +{
> >> +       struct fuse_uring_cmd_pdu *pdu = (struct fuse_uring_cmd_pdu *)cmd->pdu;
> >> +       struct fuse_ring_queue *queue = pdu->queue;
> >> +       struct fuse_ring_ent *ent;
> >> +       bool found = false;
> >> +       bool need_cmd_done = false;
> >> +
> >> +       spin_lock(&queue->lock);
> >> +
> >> +       /* XXX: This is cumbersome for large queues. */
> >> +       list_for_each_entry(ent, &queue->ent_avail_queue, list) {
> >> +               if (pdu->ring_ent == ent) {
> >> +                       found = true;
> >> +                       break;
> >> +               }
> >> +       }
> >
> > Do we have to check that the entry is on the ent_avail_queue, or can
> > we assume that if the ent->state is FRRS_WAIT, the only queue it'll be
> > on is the ent_avail_queue? I see only one case where this isn't true,
> > for teardown in fuse_uring_stop_list_entries() - if we had a
> > workaround for this, eg having some teardown state signifying that
> > io_uring_cmd_done() needs to be called on the cmd and clearing
> > FRRS_WAIT, then we could get rid of iteration through ent_avail_queue
> > for every cancelled cmd.
>
>
> I'm scared that we would run into races - I don't want to access memory
> pointed to by pdu->ring_ent, before I'm not sure it is on the list.

Oh, I was seeing that we mark the cmd as cancellable (eg in
fuse_uring_prepare_cancel()) only after the ring_ent is moved to the
ent_avail_queue (in fuse_uring_ent_avail()) and that this is done in
the scope of the queue->lock, so we would only call into
fuse_uring_cancel() when the ring_ent is on the list. Could there
still be a race condition here?

Alternatively, inspired by your "bool valid;" idea below, maybe
another solution would be having a bit in "struct fuse_ring_ent"
tracking if io_uring_cmd_done() needs to be called on it?

This is fairly unimportant though - this part could always be
optimized in a future patchset if you think it needs to be optimized,
but was just curious if these would work.


Thanks,
Joanne

> Remember the long discussion Miklos and I had about the tiny 'tag'
> variable and finding requests using existing hash lists [0] ?
> Personally I would prefer an array of
>
> struct queue_entries {
>         struct fuse_ring_ent *ring_ent;
>         bool valid;
> }
>
>
> in struct fuse_ring_queue {
>     ...
>     struct queue_entries *entries[]
> }
>
> And that array would only get freed on queue destruction. Besides
> avoiding hash lists, it would also allow to use 'valid' to know if
> we can access the ring_entry and then check the state.
>
> Thanks,
> Bernd
>
>
> [0] https://lore.kernel.org/linux-fsdevel/CAJfpegu_UQ1BNp0UDHeOZFWwUoXbJ_LP4W=o+UX6MB3DsJbH8g@xxxxxxxxxxxxxx/T/#t





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux