> > Here is what I was thinking about: > > > > https://github.com/amir73il/linux/commits/fuse_io_mode > > > > The concept that I wanted to introduce was the > > fuse_inode_deny_io_cache()/fuse_inode_allow_io_cache() > > helpers (akin to deny_write_access()/allow_write_access()). > > > > In this patch, parallel dio in progress deny open in caching mode > > and mmap, and I don't know if that is acceptable. > > Technically, instead of deny open/mmap you can use additional > > techniques to wait for in progress dio and allow caching open/mmap. > > > > Anyway, I plan to use the iocachectr and fuse_inode_deny_io_cache() > > pattern when file is open in FOPEN_PASSTHROUGH mode, but > > in this case, as agreed with Miklos, a server trying to mix open > > in caching mode on the same inode is going to fail the open. > > > > mmap is less of a problem for inode in passthrough mode, because > > mmap in of direct_io file and inode in passthrough mode is passthrough > > mmap to backing file. > > > > Anyway, if you can use this patch or parts of it, be my guest and if you > > want to use a different approach that is fine by me as well - in that case > > I will just remove the fuse_file_shared_dio_{start,end}() part from my patch. > > Hi Amir, > > here is my fuse-dio-v5 branch: > https://github.com/bsbernd/linux/commits/fuse-dio-v5/ > > (v5 is just compilation tested, tests are running now over night) This looks very nice! I left comments about some minor nits on github. > > This branch is basically about consolidating fuse write direct IO code > paths and to allow a shared lock for O_DIRECT. I actually could have > noticed the page cache issue with shared locks before with previous > versions of these patches, just my VM kernel is optimized for > compilation time and some SHM options had been missing - with that fio > refused to run. > > The branch includes a modified version of your patch: > https://github.com/bsbernd/linux/commit/6b05e52f7e253d9347d97de675b21b1707d6456e > > Main changes are > - fuse_file_io_open() does not set the FOPEN_CACHE_IO flag for > file->f_flags & O_DIRECT > - fuse_file_io_mmap() waits on a dio waitq > - fuse_file_shared_dio_start / fuse_file_shared_dio_end are moved up in > the file, as I would like to entirely remove the fuse_direct_write iter > function (all goes through cache_write_iter) > Looks mostly good, but I think that fuse_file_shared_dio_start() => fuse_inode_deny_io_cache() should actually be done after taking the inode lock (shared or exclusive) and not like in my patch. First of all, this comment in fuse_dio_wr_exclusive_lock(): /* * fuse_file_shared_dio_start() must not be called on retest, * as it decreases a counter value - must not be done twice */ if (!fuse_file_shared_dio_start(inode)) return true; ...is suggesting that semantics are not clean and this check must remain last, because if fuse_dio_wr_exclusive_lock() returns false, iocachectr must not be elevated. This is easy to get wrong in the future with current semantics. The more important thing is that while fuse_file_io_mmap() is waiting for iocachectr to drop to zero, new parallel dio can come in and starve the mmap() caller forever. I think that we are going to need to use some inode state flag (e.g. FUSE_I_DIO_WR_EXCL) to protect against this starvation, unless we do not care about this possibility? We'd only need to set this in fuse_file_io_mmap() until we get the iocachectr refcount. I *think* that fuse_inode_deny_io_cache() should be called with shared inode lock held, because of the existing lock chain i_rwsem -> page lock -> mmap_lock for page faults, but I am not sure. My brain is too cooked now to figure this out. OTOH, I don't see any problem with calling fuse_inode_deny_io_cache() with shared lock held? I pushed this version to my fuse_io_mode branch [1]. Only tested generic/095 with FOPEN_DIRECT_IO and DIRECT_IO_ALLOW_MMAP. Thanks, Amir. [1] https://github.com/amir73il/linux/commits/fuse_io_mode