Re: [PATCH v2 5/5] fuse: introduce inode io modes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Feb 1, 2024 at 7:53 PM Bernd Schubert <bschubert@xxxxxxx> wrote:
>
> On 2/1/24 17:33, Amir Goldstein wrote:
> > On Thu, Feb 1, 2024 at 4:47 PM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> >>
> >> On Thu, 1 Feb 2024 at 00:09, Bernd Schubert <bschubert@xxxxxxx> wrote:
> >>>
> >>> From: Amir Goldstein <amir73il@xxxxxxxxx>
> >>>
> >>> The fuse inode io mode is determined by the mode of its open files/mmaps
> >>> and parallel dio.
> >>>
> >>> - caching io mode - files open in caching mode or mmap on direct_io file
> >>> - direct io mode - no files open in caching mode and no files mmaped
> >>> - parallel dio mode - direct io mode with parallel dio in progress
> >>
> >> Specifically if iocachectr is:
> >>
> >>> 0 -> caching io
> >> == 0 -> direct io
> >> < 0 -> parallel io
> >>
> >>>
> >>> We use a new FOPEN_CACHE_IO flag to explicitly mark a file that was open
> >>> in caching mode.
> >>
> >> This is really confusing.  FOPEN_CACHE_IO is apparently an internally
> >> used flag, but it's defined on the userspace API.
> >>
> >> a) what is the meaning of this flag on the external API?
> >> b) what is the purpose of this flag internally?
> >
> > The purpose is to annotate the state of direct io file that was mmaped
> > as FOPEN_DIRECT_IO | FOPEN_CACHE_IO.
> > An fd like this puts inode in caching mode and its release may get inode
> > out of caching mode.
> >
> > I did not manage to do refcoutning with fuse_vma_close(), because those
> > calls are not balances with fuse_file_mmap() calls.
> >
> > The first mmap() of an FOPEN_DIRECT_IO file may incur wait for completion
> > of parallel dio.
> >
> > The only use of exporting FOPEN_CACHE_IO to the server is that it could
> > force incurring this wait at open() time instead of mmap() time.
> >
> > I also considered for servers that advertise FUSE_PASSTHROUGH
> > capability to not allow open without specifying an explicit io mode,
> > that is one of  FOPEN_DIRECT_IO | FOPEN_CACHE_IO |
> > FOPEN_PASSTHROUGH, but I did not actually implement those
> > semantics ATM.
> >
> >>
> >>>
> >>> direct_io mmap uses page cache, so first mmap will mark the file as
> >>> FOPEN_DIRECT_IO|FOPEN_CACHE_IO (i.e. mixed mode) and inode will enter
> >>> the caching io mode.
> >>>
> >>> If the server opens the file with flags FOPEN_DIRECT_IO|FOPEN_CACHE_IO,
> >>> the inode enters caching io mode already on open.
> >>>
> >>> This allows executing parallel dio when inode is not in caching mode
> >>> even if shared mmap is allowed, but no mmaps have been performed on
> >>> the inode in question.
> >>>
> >>> An open in caching mode and mmap on direct_io file now waits for all
> >>> in-progress parallel dio writes to complete, so paralle dio writes
> >>> together with FUSE_DIRECT_IO_ALLOW_MMAP is enabled by this commit.
> >>
> >> I think the per file state is wrong, the above can be done with just
> >> the per-inode state.
> >>
> >
> > The per-file state is to taint the file is "has been mmaped".
> > I did not find another way of doing this.
> > Please suggest another way.
> >
> >>
> >>>
> >>> Signed-off-by: Bernd Schubert <bschubert@xxxxxxx>
> >>> Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx>
> >>> ---
> >>>   fs/fuse/file.c            | 215 ++++++++++++++++++++++++++++++++++++--
> >>>   fs/fuse/fuse_i.h          |  79 +++++++++++++-
> >>>   include/uapi/linux/fuse.h |   2 +
> >>>   3 files changed, 286 insertions(+), 10 deletions(-)
> >>>
> >>> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> >>> index 7d2f4b0eb36a..eb9929ff9f60 100644
> >>> --- a/fs/fuse/file.c
> >>> +++ b/fs/fuse/file.c
> >>> @@ -105,10 +105,177 @@ static void fuse_release_end(struct fuse_mount *fm, struct fuse_args *args,
> >>>          kfree(ra);
> >>>   }
> >>>
> >>> +static bool fuse_file_is_direct_io(struct file *file)
> >>> +{
> >>> +       struct fuse_file *ff = file->private_data;
> >>> +
> >>> +       return ff->open_flags & FOPEN_DIRECT_IO || file->f_flags & O_DIRECT;
> >>> +}
> >>
> >> This is one of the issues with the per-file state. O_DIRECT can be
> >> changed with fcntl(fd, F_SETFL, ...), so state calculated at open can
> >> go stale.
> >>
> >
> > Ouch!
> > Actually, it may not matter much if we just ignore O_DIRECT
> > completely for the purpose of io modes.
>
> I had also missed that fcntl option :/
>
> >
> > The io mode could be determined solely from FMODE_* flags.
> >
> > Worst case is that if the server opens an O_DIRECT file without
> > FOPEN_DIRECT_IO, then this inode cannot do parallel dio
> > and this inode cannot be opened in passthrough mode.
> > I don't think that is such a big problem.
> >
> > Bernd, what do you think?
>
> Yeah, currently FOPEN_PARALLEL_DIRECT_WRITES does not have an effect
> without FOPEN_DIRECT_IO. In my fuse-dio consolidation branch that
> changes. For me it is ok that parallel dio requires
> FOPEN_DIRECT_IO, easy enough to set on the server side in open methods
> when O_DIRECT is set. We should just document it in fuse.h.
>

Ok. I removed the O_DIRECT code from the last version I pushed.
Please verify that it does not break anything in your dio tests.

> >
> > BTW, server can and probably should open O_DIRECT files
> > with FOPEN_PASSTHROUGH and that means something
> > completely different than O_DIRECT && FOPEN_DIRECT_IO -
> > it means that io is passed through to the backing file without
> > doing buffered io on the backing file.
> >
> >>> +
> >>> +/*
> >>> + * Wait for cached io to be allowed -
> >>> + * Blocks new parallel dio writes and waits for the in-progress parallel dio
> >>> + * writes to complete.
> >>> + */
> >>> +static int fuse_inode_wait_for_cached_io(struct fuse_inode *fi)
> >>> +{
> >>> +       int err = 0;
> >>> +
> >>> +       assert_spin_locked(&fi->lock);
> >>> +
> >>> +       while (!err && !fuse_inode_get_io_cache(fi)) {
> >>> +               /*
> >>> +                * Setting the bit advises new direct-io writes
> >>> +                * to use an exclusive lock - without it the wait below
> >>> +                * might be forever.
> >>> +                */
> >>> +               set_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
> >>> +               spin_unlock(&fi->lock);
> >>> +               err = wait_event_killable(fi->direct_io_waitq,
> >>> +                                         fuse_is_io_cache_allowed(fi));
> >>> +               spin_lock(&fi->lock);
> >>> +       }
> >>> +       /* Clear FUSE_I_CACHE_IO_MODE flag if failed to enter caching mode */
> >>> +       if (err && fi->iocachectr <= 0)
> >>> +               clear_bit(FUSE_I_CACHE_IO_MODE, &fi->state);
> >>> +
> >>> +       return err;
> >>> +}
> >>
> >> I suggest moving all infrastructure, including the inline helpers in
> >> fuse_i.h into a separate source file.
> >>
> >
> > OK. I can make those changes.
> >

moved to iomode.c

> >>> +
> >>> +/* Start cached io mode where parallel dio writes are not allowed */
> >>> +static int fuse_file_cached_io_start(struct inode *inode)
> >>> +{
> >>> +       struct fuse_inode *fi = get_fuse_inode(inode);
> >>> +       int err;
> >>> +
> >>> +       spin_lock(&fi->lock);
> >>> +       err = fuse_inode_wait_for_cached_io(fi);
> >>> +       spin_unlock(&fi->lock);
> >>> +       return err;
> >>> +}
> >>> +
> >>> +static void fuse_file_cached_io_end(struct inode *inode)
> >>> +{
> >>> +       struct fuse_inode *fi = get_fuse_inode(inode);
> >>> +
> >>> +       spin_lock(&fi->lock);
> >>> +       fuse_inode_put_io_cache(get_fuse_inode(inode));
> >>> +       spin_unlock(&fi->lock);
> >>> +}
> >>> +
> >>> +/* Start strictly uncached io mode where cache access is not allowed */
> >>> +static int fuse_file_uncached_io_start(struct inode *inode)
> >>> +{
> >>> +       struct fuse_inode *fi = get_fuse_inode(inode);
> >>> +       bool ok;
> >>> +
> >>> +       spin_lock(&fi->lock);
> >>> +       ok = fuse_inode_deny_io_cache(fi);
> >>> +       spin_unlock(&fi->lock);
> >>> +       return ok ? 0 : -ETXTBSY;
> >>> +}
> >>> +
> >>> +static void fuse_file_uncached_io_end(struct inode *inode)
> >>> +{
> >>> +       struct fuse_inode *fi = get_fuse_inode(inode);
> >>> +       bool allow_cached_io;
> >>> +
> >>> +       spin_lock(&fi->lock);
> >>> +       allow_cached_io = fuse_inode_allow_io_cache(fi);
> >>> +       spin_unlock(&fi->lock);
> >>> +       if (allow_cached_io)
> >>> +               wake_up(&fi->direct_io_waitq);
> >>> +}
> >>> +
> >>> +/* Open flags to determine regular file io mode */
> >>> +#define FOPEN_IO_MODE_MASK \
> >>> +       (FOPEN_DIRECT_IO | FOPEN_CACHE_IO)
> >>> +
> >>> +/* Request access to submit new io to inode via open file */
> >>> +static int fuse_file_io_open(struct file *file, struct inode *inode)
> >>> +{
> >>> +       struct fuse_file *ff = file->private_data;
> >>> +       int iomode_flags = ff->open_flags & FOPEN_IO_MODE_MASK;
> >>> +       int err;
> >>> +
> >>> +       err = -EBUSY;
> >>> +       if (WARN_ON(ff->io_opened))
> >>> +               goto fail;
> >>> +
> >>> +       if (!S_ISREG(inode->i_mode) || FUSE_IS_DAX(inode)) {
> >>
> >> This S_ISREG check can also go away with separating the directory opens.
> >
> > OK. I will see if I can make that cleanup as well.
>

done.

Bernd,

I pushed branch with latest cleanup patches and addressed Miklos'
simpler review comments to:

https://github.com/amir73il/linux/commits/fuse_io_mode-020224/

The one thing I did not change is the FOPEN_CACHE_IO flag
and per-file state, because I have no idea how to fix this differently.
I can change the external flag to an internal state if that is the main
concern, but I had a feeling there was more to it.

If you get the chance, please review and re-run your tests.

Thanks,
Amir.





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux