Re: [PATCH 0/2] fuse: Rename DIRECT_IO_{RELAX -> ALLOW_MMAP}

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 12/13/23 15:09, Bernd Schubert wrote:


On 12/13/23 14:03, Bernd Schubert wrote:


On 12/13/23 12:23, Amir Goldstein wrote:

     Thanks Amir, I'm going to look at it in detail in the morning.
     Btw, there is another bad direct_io_allow_mmap issue (part of it is      invalidate_inode_pages2, which you already noticed, but not alone).
     Going to send out the patch once xfstests have passed
https://github.com/bsbernd/linux/commit/3dae6b05854c4fe84302889a5625c7e5428cdd6c <https://github.com/bsbernd/linux/commit/3dae6b05854c4fe84302889a5625c7e5428cdd6c>


Nice!
But I think that invalidate pages issue is not restricted to shared mmap?

So history for that is

commit 3121bfe7631126d1b13064855ac2cfa164381bb0
Author: Miklos Szeredi <mszeredi@xxxxxxx>
Date:   Thu Apr 9 17:37:53 2009 +0200

      fuse: fix "direct_io" private mmap

      MAP_PRIVATE mmap could return stale data from the cache for
      "direct_io" files.  Fix this by flushing the cache on mmap.

      Found with a slightly modified fsx-linux.

      Signed-off-by: Miklos Szeredi <mszeredi@xxxxxxx>

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 0946861b10b7..06f30e965676 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1298,6 +1298,8 @@ static int fuse_direct_mmap(struct file *file, struct vm_area_struct *vma)
          if (vma->vm_flags & VM_MAYSHARE)
                  return -ENODEV;

+       invalidate_inode_pages2(file->f_mapping);
+
          return generic_file_mmap(file, vma);
   }


I don't have a strong opinion here - so idea of this patch is to avoid
exposing stale data from a previous mmap. I guess (and probably hard to achieve semantics) would be to invalidate pages when the last mapping of that _area_
is done?
So now with a shared map, data are supposed to be stored in files and
close-to-open consistency with FOPEN_KEEP_CACHE should handle the invalidation?


Nevermind, it was just my bad understanding of invalidate_inode_pages2().
I think it calls fuse_launder_folio() for dirty pages, so data loss is
not a concern.


I think that the mix of direct io file with private mmap is common and
doesn't have issues, but the mix of direct io files and caching files on
the same inode is probably not very common has the same issues as the
direct_io_allow_mmap regression that you are fixing.

Yeah. I also find it interesting that generic_file_mmap is not doing such
things for files opened with O_DIRECT - FOPEN_DIRECT_IO tries to do
strong coherency?


I'm going to send out the patch for now as it is, as that might become a longer
discussion - maybe Miklos could comment on it.


I think your patch should not be avoiding invalidate_inode_pages2()
in the shared mmap case.

You have done that part because of my comment which was wrong,
not because it reproduced a bug.

I debating with myself since yesterday, where invalidate_inode_pages2() belongs to.

We have

FOPEN_KEEP_CACHE - if not set invalidate_inode_pages2 is done by fuse_open_common().  If set, server side signals that it wants to keep the cache. Also interesting, I don't see anything that prevents that FOPEN_DIRECT_IO and FOPEN_KEEP_CACHE are set together.
Also to consider, fc->_no_open sets FOPEN_KEEP_CACHE by default.

MAP_PRIVATE - here I'm sure that invalidate_inode_pages2 is right, even with FOPEN_KEEP_CACHE. There is also zero risk to lose data, as MAP_PRIVATE does not write out data.

MAP_SHARED - was not allowed with FOPEN_DIRECT_IO before. Unless FOPEN_KEEP_CACHE is set, close-to-open semantics come in. My argument to to avoid invalidate_inode_pages2 in the current patch is that MAP_SHARED wants to share data between processes. And also maybe important, there is no flush in that function - dirty pages would be thrown away - data corruption!?


Ah sorry, I had actually missed folio_launder() -> fuse_launder_folio(), ok fine then, we can keep it for both cases.


Hmm, actually, why is fuse_launder_folio only writing out one single page, instead of all pages in the folio? This actually applies to all file systems that have have .launder_folio method (nfs, orangefs, smb, fuse) with the exception of v9fs. The latter is the only that writes out the entire folio.


Thanks,
Bernd




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux