Re: [PATCH v3 9/9] fuse: allow parallel dio writes with FUSE_DIRECT_IO_ALLOW_MMAP

Amir Goldstein <amir73il@xxxxxxxxx> · Fri, 9 Feb 2024 14:19:22 +0200

On Fri, Feb 9, 2024 at 2:12 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>
> On Fri, Feb 9, 2024 at 1:48 PM Bernd Schubert
> <bernd.schubert@xxxxxxxxxxx> wrote:
> >
> >
> >
> > On 2/9/24 12:21, Bernd Schubert wrote:
> > >
> > >
> > > On 2/9/24 11:50, Miklos Szeredi wrote:
> > >> On Thu, 8 Feb 2024 at 18:09, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> > >>
> > >>>  static int fuse_inode_get_io_cache(struct fuse_inode *fi)
> > >>>  {
> > >>> +       int err = 0;
> > >>> +
> > >>>         assert_spin_locked(&fi->lock);
> > >>> -       if (fi->iocachectr < 0)
> > >>> -               return -ETXTBSY;
> > >>> -       if (fi->iocachectr++ == 0)
> > >>> -               set_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
> > >>> -       return 0;
> > >>> +       /*
> > >>> +        * Setting the bit advises new direct-io writes to use an exclusive
> > >>> +        * lock - without it the wait below might be forever.
> > >>> +        */
> > >>> +       set_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
> > >>> +       while (!err && fuse_is_io_cache_wait(fi)) {
> > >>> +               spin_unlock(&fi->lock);
> > >>> +               err = wait_event_killable(fi->direct_io_waitq,
> > >>> +                                         !fuse_is_io_cache_wait(fi));
> > >>> +               spin_lock(&fi->lock);
> > >>> +       }
> > >>> +       /*
> > >>> +        * Enter caching mode or clear the FUSE_I_CACHE_IO_MODE bit if we
> > >>> +        * failed to enter caching mode and no other caching open exists.
> > >>> +        */
> > >>> +       if (!err)
> > >>> +               fi->iocachectr++;
> > >>> +       else if (fi->iocachectr <= 0)
> > >>> +               clear_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
> > >>
> > >> This seems wrong:  if the current task is killed, and there's anther
> > >> task trying to get cached open mode, then clearing
> > >> FUSE_I_CACHE_IO_MODE will allow new parallel writes, breaking this
> > >> logic.
> > >
> > > This is called holding a spin lock, another task cannot enter here?
> > > Neither can direct-IO, because it is also locked out. The bit helps DIO
> > > code to avoid trying to do parallel DIO without the need to take a spin
> > > lock. When DIO decides it wants to do parallel IO, it first has to get
> > > past fi->iocachectr < 0 - if there is another task trying to do cache
> > > IO, either DIO gets < 0 first and the other cache task has to wait, or
> > > cache tasks gets > 0 and dio will continue with the exclusive lock. Or
> > > do I miss something?
> >
> > Now I see what you mean, there is an unlock and another task might have also already set the bit
> >
> > I think this should do
> >
> > diff --git a/fs/fuse/iomode.c b/fs/fuse/iomode.c
> > index acd0833ae873..7c22edd674cb 100644
> > --- a/fs/fuse/iomode.c
> > +++ b/fs/fuse/iomode.c
> > @@ -41,6 +41,8 @@ static int fuse_inode_get_io_cache(struct fuse_inode *fi)
> >                 err = wait_event_killable(fi->direct_io_waitq,
> >                                           !fuse_is_io_cache_wait(fi));
> >                 spin_lock(&fi->lock);
> > +               if (!err)
> > +                       /* Another interrupted task might have unset it */
> > +                       set_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
> >         }
> >         /*
> >          * Enter caching mode or clear the FUSE_I_CACHE_IO_MODE bit if we
>
> I think this race can happen even if we remove killable_
> not sure - anyway, with fuse passthrough there is another error
> condition:
>
>         /*
>          * Check if inode entered passthrough io mode while waiting for parallel
>          * dio write completion.
>          */
>         if (fuse_inode_backing(fi))
>                 err = -ETXTBSY;
>
> But in this condition, all waiting tasks should abort the wait,
> so it does not seem a problem to clean the flag.
>
> Anyway, IMO it is better to set the flag before every wait and on
> success. Like below.
>
> Thanks,
> Amir.
>
> --- a/fs/fuse/iomode.c
> +++ b/fs/fuse/iomode.c
> @@ -35,8 +35,8 @@ static int fuse_inode_get_io_cache(struct fuse_inode *fi)
>          * Setting the bit advises new direct-io writes to use an exclusive
>          * lock - without it the wait below might be forever.
>          */
> -       set_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
>         while (!err && fuse_is_io_cache_wait(fi)) {
> +               set_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
>                 spin_unlock(&fi->lock);
>                 err = wait_event_killable(fi->direct_io_waitq,
>                                           !fuse_is_io_cache_wait(fi));
> @@ -53,8 +53,8 @@ static int fuse_inode_get_io_cache(struct fuse_inode *fi)
>          * Enter caching mode or clear the FUSE_I_CACHE_IO_MODE bit if we
>          * failed to enter caching mode and no other caching open exists.
>          */
> -       if (!err)
> -               fi->iocachectr++;
> +       if (!err && fi->iocachectr++ == 0)
> +               set_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
>         else if (fi->iocachectr <= 0)
>                 clear_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
>         return err;

Oops should be:

       if (!err) {
               fi->iocachectr++;
               set_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
       } else if (fi->iocachectr <= 0) {
               clear_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
       }
       return err;