On 2/9/24 13:12, Amir Goldstein wrote: > On Fri, Feb 9, 2024 at 1:48 PM Bernd Schubert > <bernd.schubert@xxxxxxxxxxx> wrote: >> >> >> >> On 2/9/24 12:21, Bernd Schubert wrote: >>> >>> >>> On 2/9/24 11:50, Miklos Szeredi wrote: >>>> On Thu, 8 Feb 2024 at 18:09, Amir Goldstein <amir73il@xxxxxxxxx> wrote: >>>> >>>>> static int fuse_inode_get_io_cache(struct fuse_inode *fi) >>>>> { >>>>> + int err = 0; >>>>> + >>>>> assert_spin_locked(&fi->lock); >>>>> - if (fi->iocachectr < 0) >>>>> - return -ETXTBSY; >>>>> - if (fi->iocachectr++ == 0) >>>>> - set_bit(FUSE_I_CACHE_IO_MODE, &fi->state); >>>>> - return 0; >>>>> + /* >>>>> + * Setting the bit advises new direct-io writes to use an exclusive >>>>> + * lock - without it the wait below might be forever. >>>>> + */ >>>>> + set_bit(FUSE_I_CACHE_IO_MODE, &fi->state); >>>>> + while (!err && fuse_is_io_cache_wait(fi)) { >>>>> + spin_unlock(&fi->lock); >>>>> + err = wait_event_killable(fi->direct_io_waitq, >>>>> + !fuse_is_io_cache_wait(fi)); >>>>> + spin_lock(&fi->lock); >>>>> + } >>>>> + /* >>>>> + * Enter caching mode or clear the FUSE_I_CACHE_IO_MODE bit if we >>>>> + * failed to enter caching mode and no other caching open exists. >>>>> + */ >>>>> + if (!err) >>>>> + fi->iocachectr++; >>>>> + else if (fi->iocachectr <= 0) >>>>> + clear_bit(FUSE_I_CACHE_IO_MODE, &fi->state); >>>> >>>> This seems wrong: if the current task is killed, and there's anther >>>> task trying to get cached open mode, then clearing >>>> FUSE_I_CACHE_IO_MODE will allow new parallel writes, breaking this >>>> logic. >>> >>> This is called holding a spin lock, another task cannot enter here? >>> Neither can direct-IO, because it is also locked out. The bit helps DIO >>> code to avoid trying to do parallel DIO without the need to take a spin >>> lock. When DIO decides it wants to do parallel IO, it first has to get >>> past fi->iocachectr < 0 - if there is another task trying to do cache >>> IO, either DIO gets < 0 first and the other cache task has to wait, or >>> cache tasks gets > 0 and dio will continue with the exclusive lock. Or >>> do I miss something? >> >> Now I see what you mean, there is an unlock and another task might have also already set the bit >> >> I think this should do >> >> diff --git a/fs/fuse/iomode.c b/fs/fuse/iomode.c >> index acd0833ae873..7c22edd674cb 100644 >> --- a/fs/fuse/iomode.c >> +++ b/fs/fuse/iomode.c >> @@ -41,6 +41,8 @@ static int fuse_inode_get_io_cache(struct fuse_inode *fi) >> err = wait_event_killable(fi->direct_io_waitq, >> !fuse_is_io_cache_wait(fi)); >> spin_lock(&fi->lock); >> + if (!err) >> + /* Another interrupted task might have unset it */ >> + set_bit(FUSE_I_CACHE_IO_MODE, &fi->state); >> } >> /* >> * Enter caching mode or clear the FUSE_I_CACHE_IO_MODE bit if we > > I think this race can happen even if we remove killable_ > not sure - anyway, with fuse passthrough there is another error > condition: > > /* > * Check if inode entered passthrough io mode while waiting for parallel > * dio write completion. > */ > if (fuse_inode_backing(fi)) > err = -ETXTBSY; > > But in this condition, all waiting tasks should abort the wait, > so it does not seem a problem to clean the flag. > > Anyway, IMO it is better to set the flag before every wait and on > success. Like below. > > Thanks, > Amir. > > --- a/fs/fuse/iomode.c > +++ b/fs/fuse/iomode.c > @@ -35,8 +35,8 @@ static int fuse_inode_get_io_cache(struct fuse_inode *fi) > * Setting the bit advises new direct-io writes to use an exclusive > * lock - without it the wait below might be forever. > */ > - set_bit(FUSE_I_CACHE_IO_MODE, &fi->state); > while (!err && fuse_is_io_cache_wait(fi)) { > + set_bit(FUSE_I_CACHE_IO_MODE, &fi->state); > spin_unlock(&fi->lock); > err = wait_event_killable(fi->direct_io_waitq, > !fuse_is_io_cache_wait(fi)); > @@ -53,8 +53,8 @@ static int fuse_inode_get_io_cache(struct fuse_inode *fi) > * Enter caching mode or clear the FUSE_I_CACHE_IO_MODE bit if we > * failed to enter caching mode and no other caching open exists. > */ > - if (!err) > - fi->iocachectr++; > + if (!err && fi->iocachectr++ == 0) > + set_bit(FUSE_I_CACHE_IO_MODE, &fi->state); > else if (fi->iocachectr <= 0) > clear_bit(FUSE_I_CACHE_IO_MODE, &fi->state); > return err; Yeah, that is fine with me. Basically every wait signals "please let me in", instead of doing it a single time only. Thanks, Bernd