On Tue, Dec 5, 2023 at 1:42 AM Bernd Schubert <bernd.schubert@xxxxxxxxxxx> wrote: > > > > On 12/4/23 11:04, Bernd Schubert wrote: > > > > > > On 12/4/23 10:27, Miklos Szeredi wrote: > >> On Mon, 4 Dec 2023 at 07:50, Amir Goldstein <amir73il@xxxxxxxxx> wrote: > >>> > >>> On Mon, Dec 4, 2023 at 1:00 AM Bernd Schubert > >>> <bernd.schubert@xxxxxxxxxxx> wrote: > >>>> > >>>> Hi Amir, > >>>> > >>>> On 12/3/23 12:20, Amir Goldstein wrote: > >>>>> On Sat, Dec 2, 2023 at 5:06 PM Amir Goldstein <amir73il@xxxxxxxxx> > >>>>> wrote: > >>>>>> > >>>>>> On Mon, Nov 6, 2023 at 4:08 PM Bernd Schubert > >>>>>> <bernd.schubert@xxxxxxxxxxx> wrote: > >>>>>>> > >>>>>>> Hi Miklos, > >>>>>>> > >>>>>>> On 9/20/23 10:15, Miklos Szeredi wrote: > >>>>>>>> On Wed, 20 Sept 2023 at 04:41, Tyler Fanelli > >>>>>>>> <tfanelli@xxxxxxxxxx> wrote: > >>>>>>>>> > >>>>>>>>> At the moment, FUSE_INIT's DIRECT_IO_RELAX flag only serves the > >>>>>>>>> purpose > >>>>>>>>> of allowing shared mmap of files opened/created with DIRECT_IO > >>>>>>>>> enabled. > >>>>>>>>> However, it leaves open the possibility of further relaxing the > >>>>>>>>> DIRECT_IO restrictions (and in-effect, the cache coherency > >>>>>>>>> guarantees of > >>>>>>>>> DIRECT_IO) in the future. > >>>>>>>>> > >>>>>>>>> The DIRECT_IO_ALLOW_MMAP flag leaves no ambiguity of its > >>>>>>>>> purpose. It > >>>>>>>>> only serves to allow shared mmap of DIRECT_IO files, while still > >>>>>>>>> bypassing the cache on regular reads and writes. The shared > >>>>>>>>> mmap is the > >>>>>>>>> only loosening of the cache policy that can take place with the > >>>>>>>>> flag. > >>>>>>>>> This removes some ambiguity and introduces a more stable flag > >>>>>>>>> to be used > >>>>>>>>> in FUSE_INIT. Furthermore, we can document that to allow shared > >>>>>>>>> mmap'ing > >>>>>>>>> of DIRECT_IO files, a user must enable DIRECT_IO_ALLOW_MMAP. > >>>>>>>>> > >>>>>>>>> Tyler Fanelli (2): > >>>>>>>>> fs/fuse: Rename DIRECT_IO_RELAX to DIRECT_IO_ALLOW_MMAP > >>>>>>>>> docs/fuse-io: Document the usage of DIRECT_IO_ALLOW_MMAP > >>>>>>>> > >>>>>>>> Looks good. > >>>>>>>> > >>>>>>>> Applied, thanks. Will send the PR during this merge window, > >>>>>>>> since the > >>>>>>>> rename could break stuff if already released. > >>>>>>> > >>>>>>> I'm just porting back this feature to our internal fuse module > >>>>>>> and it > >>>>>>> looks these rename patches have been forgotten? > >>>>>>> > >>>>>>> > >>>>>> > >>>>>> Hi Miklos, Bernd, > >>>>>> > >>>>>> I was looking at the DIRECT_IO_ALLOW_MMAP code and specifically at > >>>>>> commit b5a2a3a0b776 ("fuse: write back dirty pages before direct > >>>>>> write in > >>>>>> direct_io_relax mode") and I was wondering - isn't dirty pages > >>>>>> writeback > >>>>>> needed *before* invalidate_inode_pages2() in fuse_file_mmap() for > >>>>>> direct_io_allow_mmap case? > >>>>>> > >>>>>> For FUSE_PASSTHROUGH, I am going to need to call fuse_vma_close() > >>>>>> for munmap of files also in direct-io mode [1], so I was > >>>>>> considering installing > >>>>>> fuse_file_vm_ops for the FOPEN_DIRECT_IO case, same as caching case, > >>>>>> and regardless of direct_io_allow_mmap. > >>>>>> > >>>>>> I was asking myself if there was a good reason why > >>>>>> fuse_page_mkwrite()/ > >>>>>> fuse_wait_on_page_writeback()/fuse_vma_close()/write_inode_now() > >>>>>> should NOT be called for the FOPEN_DIRECT_IO case regardless of > >>>>>> direct_io_allow_mmap? > >>>>>> > >>>>> > >>>>> Before trying to make changes to fuse_file_mmap() I tried to test > >>>>> DIRECT_IO_RELAX - I enabled it in libfuse and ran fstest with > >>>>> passthrough_hp --direct-io. > >>>>> > >>>>> The test generic/095 - "Concurrent mixed I/O (buffer I/O, aiodio, > >>>>> mmap, splice) > >>>>> on the same files" blew up hitting BUG_ON(fi->writectr < 0) in > >>>>> fuse_set_nowrite() > >>>>> > >>>>> I am wondering how this code was tested? > >>>>> > >>>>> I could not figure out the problem and how to fix it. > >>>>> Please suggest a fix and let me know which adjustments are needed > >>>>> if I want to use fuse_file_vm_ops for all mmap modes. > >>>> > >>>> So fuse_set_nowrite() tests for inode_is_locked(), but that also > >>>> succeeds for a shared lock. It gets late here (and I might miss > >>>> something), but I think we have an issue with > >>>> FOPEN_PARALLEL_DIRECT_WRITES. Assuming there would be plain O_DIRECT > >>>> and > >>>> mmap, the same issue might triggered? Hmm, well, so far plain O_DIRECT > >>>> does not support FOPEN_PARALLEL_DIRECT_WRITES yet - the patches for > >>>> that > >>>> are still pending. > >>>> > >>> > >>> Your analysis seems to be correct. > >>> > >>> Attached patch fixes the problem and should be backported to 6.6.y. > >>> > >>> Miklos, > >>> > >>> I prepared the patch on top of master and not on top of the rename to > >>> FUSE_DIRECT_IO_ALLOW_MMAP in for-next for ease of backport to > >>> 6.6.y, although if you are planning send the flag rename to v6.7 as a > >>> fix, > >>> you may prefer to apply the fix after the rename and request to backport > >>> the flag rename along with the fix to 6.6.y. > >> > >> I've done that. Thanks for the fix and testing. > > > > Hi Amir, hi Miklos, > > > > could you please hold on a bit before sending the patch upstream? > > I think we can just test for fuse_range_is_writeback in > > fuse_direct_write_iter. I will have a patch in a few minutes. > > Hmm, that actually doesn't work as we would need to hold the inode lock > in page write functions. > Then tried to do it per inode and only when the inode gets cached writes > or mmap - this triggers a lockdep lock order warning, because > fuse_file_mmap is called with mm->mmap_lock and would take the inode > lock. But through > fuse_direct_io/iov_iter_get_pages2/__iov_iter_get_pages_alloc these > locks are taken the other way around. > So right now I don't see a way out - we need to go with Amirs patch first. > > Is it actually important for FUSE_DIRECT_IO_ALLOW_MMAP fs (e.g. virtiofsd) to support FOPEN_PARALLEL_DIRECT_WRITES? I guess not otherwise, the combination would have been tested. FOPEN_PARALLEL_DIRECT_WRITES is typically important for network fs and FUSE_DIRECT_IO_ALLOW_MMAP is typically not for network fs. Right? FWIW, with FUSE_PASSTHROUGH, I plan that a shared mmap of an inode in "passthrough mode" (i.e. has an open FOPEN_PASSTHROUGH file) will be allowed (maps the backing file) regardless of fc->direct_io_allow_mmap. FOPEN_PARALLEL_DIRECT_WRITES will also be allowed on an inode in "passthrough mode", because an inode in "passthrough mode" cannot have any pending page cache writes. This makes me realize that I will also need to handle passthrough of ->direct_IO() on an FOPEN_PASSTHROUGH file. Thanks, Amir.