Re: [PATCH v3 9/9] fuse: allow parallel dio writes with FUSE_DIRECT_IO_ALLOW_MMAP

Amir Goldstein <amir73il@xxxxxxxxx> · Fri, 9 Feb 2024 14:12:38 +0200

On Fri, Feb 9, 2024 at 1:48 PM Bernd Schubert
<bernd.schubert@xxxxxxxxxxx> wrote:
>
>
>
> On 2/9/24 12:21, Bernd Schubert wrote:
> >
> >
> > On 2/9/24 11:50, Miklos Szeredi wrote:
> >> On Thu, 8 Feb 2024 at 18:09, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> >>
> >>>  static int fuse_inode_get_io_cache(struct fuse_inode *fi)
> >>>  {
> >>> +       int err = 0;
> >>> +
> >>>         assert_spin_locked(&fi->lock);
> >>> -       if (fi->iocachectr < 0)
> >>> -               return -ETXTBSY;
> >>> -       if (fi->iocachectr++ == 0)
> >>> -               set_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
> >>> -       return 0;
> >>> +       /*
> >>> +        * Setting the bit advises new direct-io writes to use an exclusive
> >>> +        * lock - without it the wait below might be forever.
> >>> +        */
> >>> +       set_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
> >>> +       while (!err && fuse_is_io_cache_wait(fi)) {
> >>> +               spin_unlock(&fi->lock);
> >>> +               err = wait_event_killable(fi->direct_io_waitq,
> >>> +                                         !fuse_is_io_cache_wait(fi));
> >>> +               spin_lock(&fi->lock);
> >>> +       }
> >>> +       /*
> >>> +        * Enter caching mode or clear the FUSE_I_CACHE_IO_MODE bit if we
> >>> +        * failed to enter caching mode and no other caching open exists.
> >>> +        */
> >>> +       if (!err)
> >>> +               fi->iocachectr++;
> >>> +       else if (fi->iocachectr <= 0)
> >>> +               clear_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
> >>
> >> This seems wrong:  if the current task is killed, and there's anther
> >> task trying to get cached open mode, then clearing
> >> FUSE_I_CACHE_IO_MODE will allow new parallel writes, breaking this
> >> logic.
> >
> > This is called holding a spin lock, another task cannot enter here?
> > Neither can direct-IO, because it is also locked out. The bit helps DIO
> > code to avoid trying to do parallel DIO without the need to take a spin
> > lock. When DIO decides it wants to do parallel IO, it first has to get
> > past fi->iocachectr < 0 - if there is another task trying to do cache
> > IO, either DIO gets < 0 first and the other cache task has to wait, or
> > cache tasks gets > 0 and dio will continue with the exclusive lock. Or
> > do I miss something?
>
> Now I see what you mean, there is an unlock and another task might have also already set the bit
>
> I think this should do
>
> diff --git a/fs/fuse/iomode.c b/fs/fuse/iomode.c
> index acd0833ae873..7c22edd674cb 100644
> --- a/fs/fuse/iomode.c
> +++ b/fs/fuse/iomode.c
> @@ -41,6 +41,8 @@ static int fuse_inode_get_io_cache(struct fuse_inode *fi)
>                 err = wait_event_killable(fi->direct_io_waitq,
>                                           !fuse_is_io_cache_wait(fi));
>                 spin_lock(&fi->lock);
> +               if (!err)
> +                       /* Another interrupted task might have unset it */
> +                       set_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
>         }
>         /*
>          * Enter caching mode or clear the FUSE_I_CACHE_IO_MODE bit if we

I think this race can happen even if we remove killable_
not sure - anyway, with fuse passthrough there is another error
condition:

        /*
         * Check if inode entered passthrough io mode while waiting for parallel
         * dio write completion.
         */
        if (fuse_inode_backing(fi))
                err = -ETXTBSY;

But in this condition, all waiting tasks should abort the wait,
so it does not seem a problem to clean the flag.

Anyway, IMO it is better to set the flag before every wait and on
success. Like below.

Thanks,
Amir.

--- a/fs/fuse/iomode.c
+++ b/fs/fuse/iomode.c
@@ -35,8 +35,8 @@ static int fuse_inode_get_io_cache(struct fuse_inode *fi)
         * Setting the bit advises new direct-io writes to use an exclusive
         * lock - without it the wait below might be forever.
         */
-       set_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
        while (!err && fuse_is_io_cache_wait(fi)) {
+               set_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
                spin_unlock(&fi->lock);
                err = wait_event_killable(fi->direct_io_waitq,
                                          !fuse_is_io_cache_wait(fi));
@@ -53,8 +53,8 @@ static int fuse_inode_get_io_cache(struct fuse_inode *fi)
         * Enter caching mode or clear the FUSE_I_CACHE_IO_MODE bit if we
         * failed to enter caching mode and no other caching open exists.
         */
-       if (!err)
-               fi->iocachectr++;
+       if (!err && fi->iocachectr++ == 0)
+               set_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
        else if (fi->iocachectr <= 0)
                clear_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
        return err;