Re: [PATCH v3 9/9] fuse: allow parallel dio writes with FUSE_DIRECT_IO_ALLOW_MMAP

Bernd Schubert <bernd.schubert@xxxxxxxxxxx> · Fri, 9 Feb 2024 13:19:23 +0100

On 2/9/24 13:12, Amir Goldstein wrote:
> On Fri, Feb 9, 2024 at 1:48 PM Bernd Schubert
> <bernd.schubert@xxxxxxxxxxx> wrote:
>>
>>
>>
>> On 2/9/24 12:21, Bernd Schubert wrote:
>>>
>>>
>>> On 2/9/24 11:50, Miklos Szeredi wrote:
>>>> On Thu, 8 Feb 2024 at 18:09, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>>>>
>>>>>  static int fuse_inode_get_io_cache(struct fuse_inode *fi)
>>>>>  {
>>>>> +       int err = 0;
>>>>> +
>>>>>         assert_spin_locked(&fi->lock);
>>>>> -       if (fi->iocachectr < 0)
>>>>> -               return -ETXTBSY;
>>>>> -       if (fi->iocachectr++ == 0)
>>>>> -               set_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
>>>>> -       return 0;
>>>>> +       /*
>>>>> +        * Setting the bit advises new direct-io writes to use an exclusive
>>>>> +        * lock - without it the wait below might be forever.
>>>>> +        */
>>>>> +       set_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
>>>>> +       while (!err && fuse_is_io_cache_wait(fi)) {
>>>>> +               spin_unlock(&fi->lock);
>>>>> +               err = wait_event_killable(fi->direct_io_waitq,
>>>>> +                                         !fuse_is_io_cache_wait(fi));
>>>>> +               spin_lock(&fi->lock);
>>>>> +       }
>>>>> +       /*
>>>>> +        * Enter caching mode or clear the FUSE_I_CACHE_IO_MODE bit if we
>>>>> +        * failed to enter caching mode and no other caching open exists.
>>>>> +        */
>>>>> +       if (!err)
>>>>> +               fi->iocachectr++;
>>>>> +       else if (fi->iocachectr <= 0)
>>>>> +               clear_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
>>>>
>>>> This seems wrong:  if the current task is killed, and there's anther
>>>> task trying to get cached open mode, then clearing
>>>> FUSE_I_CACHE_IO_MODE will allow new parallel writes, breaking this
>>>> logic.
>>>
>>> This is called holding a spin lock, another task cannot enter here?
>>> Neither can direct-IO, because it is also locked out. The bit helps DIO
>>> code to avoid trying to do parallel DIO without the need to take a spin
>>> lock. When DIO decides it wants to do parallel IO, it first has to get
>>> past fi->iocachectr < 0 - if there is another task trying to do cache
>>> IO, either DIO gets < 0 first and the other cache task has to wait, or
>>> cache tasks gets > 0 and dio will continue with the exclusive lock. Or
>>> do I miss something?
>>
>> Now I see what you mean, there is an unlock and another task might have also already set the bit
>>
>> I think this should do
>>
>> diff --git a/fs/fuse/iomode.c b/fs/fuse/iomode.c
>> index acd0833ae873..7c22edd674cb 100644
>> --- a/fs/fuse/iomode.c
>> +++ b/fs/fuse/iomode.c
>> @@ -41,6 +41,8 @@ static int fuse_inode_get_io_cache(struct fuse_inode *fi)
>>                 err = wait_event_killable(fi->direct_io_waitq,
>>                                           !fuse_is_io_cache_wait(fi));
>>                 spin_lock(&fi->lock);
>> +               if (!err)
>> +                       /* Another interrupted task might have unset it */
>> +                       set_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
>>         }
>>         /*
>>          * Enter caching mode or clear the FUSE_I_CACHE_IO_MODE bit if we
> 
> I think this race can happen even if we remove killable_
> not sure - anyway, with fuse passthrough there is another error
> condition:
> 
>         /*
>          * Check if inode entered passthrough io mode while waiting for parallel
>          * dio write completion.
>          */
>         if (fuse_inode_backing(fi))
>                 err = -ETXTBSY;
> 
> But in this condition, all waiting tasks should abort the wait,
> so it does not seem a problem to clean the flag.
> 
> Anyway, IMO it is better to set the flag before every wait and on
> success. Like below.
> 
> Thanks,
> Amir.
> 
> --- a/fs/fuse/iomode.c
> +++ b/fs/fuse/iomode.c
> @@ -35,8 +35,8 @@ static int fuse_inode_get_io_cache(struct fuse_inode *fi)
>          * Setting the bit advises new direct-io writes to use an exclusive
>          * lock - without it the wait below might be forever.
>          */
> -       set_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
>         while (!err && fuse_is_io_cache_wait(fi)) {
> +               set_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
>                 spin_unlock(&fi->lock);
>                 err = wait_event_killable(fi->direct_io_waitq,
>                                           !fuse_is_io_cache_wait(fi));
> @@ -53,8 +53,8 @@ static int fuse_inode_get_io_cache(struct fuse_inode *fi)
>          * Enter caching mode or clear the FUSE_I_CACHE_IO_MODE bit if we
>          * failed to enter caching mode and no other caching open exists.
>          */
> -       if (!err)
> -               fi->iocachectr++;
> +       if (!err && fi->iocachectr++ == 0)
> +               set_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
>         else if (fi->iocachectr <= 0)
>                 clear_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
>         return err;

Yeah, that is fine with me. Basically every wait signals "please let me
in", instead of doing it a single time only.

Thanks,
Bernd