On 11/29/23 22:39, Antonio SJ Musumeci wrote:
On 11/29/23 14:46, Bernd Schubert wrote:
On 11/29/23 18:39, Amir Goldstein wrote:
On Wed, Nov 29, 2023 at 6:55 PM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
On Wed, 29 Nov 2023 at 16:52, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
direct I/O read()/write() is never a problem.
The question is whether mmap() on a file opened with FOPEN_DIRECT_IO
when the inode is in passthrough mode, also uses fuse_passthrough_mmap()?
I think it should.
or denied, similar to how mmap with (ff->open_flags & FOPEN_DIRECT_IO) &&
(vma->vm_flags & VM_MAYSHARE) && !fc->direct_io_relax
is denied?
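For readers following along, the condition quoted above can be modelled in userspace roughly as below. This is only a sketch: the function name is hypothetical, the flag values are defined locally for illustration rather than taken from kernel headers, and the -ENODEV return is an assumption about what "denied" means here.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative values for this userspace model only; the real
 * definitions live in the kernel headers. */
#define FOPEN_DIRECT_IO (1u << 0)
#define VM_MAYSHARE     0x00000080ul

/* Hypothetical helper: returns 0 if the mmap may proceed, or -ENODEV
 * (assumed error code) if it is denied by the quoted condition. */
static int model_direct_io_mmap_check(unsigned int open_flags,
                                      unsigned long vm_flags,
                                      bool direct_io_relax)
{
    if ((open_flags & FOPEN_DIRECT_IO) &&
        (vm_flags & VM_MAYSHARE) &&
        !direct_io_relax)
        return -ENODEV;

    return 0;
}
```

In this model only the combination of a direct-io open, a possibly shared mapping, and no relax mode is refused; private mappings and relaxed mode pass through.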
What would be the use case for FOPEN_DIRECT_IO with passthrough mmap?
I don't have a use case. That's why I was wondering if we should
support it at all, but will try to make it work.
What is actually the use case for FOPEN_DIRECT_IO and passthrough?
Avoiding double page cache?
A bit more challenging, because we will need to track unmaps, or at
least track "was_cached_mmaped" state per file, but doable.
Tracking unmaps via fuse_vma_close() should not be difficult.
OK. so any existing mmap, whether on FOPEN_DIRECT_IO or not
always prevents an inode from being "neutral".
Thanks,
Bernd
> Avoiding double page cache?
Currently my users enable direct_io because 1) it is typically
materially faster than not and 2) it avoids double page caching (it is a
union filesystem).
3) You want coherency for network file systems (our use case).
So performance kind of means it is preferred to have it enabled for
passthrough. And with that MAP_SHARED gets rather important, imho. I
don't know if recent gcc versions still do it, but gcc used to write
files using MAP_SHARED. In the HPC/AI world, Python tools also tend to do
IO with MAP_SHARED.
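To make the MAP_SHARED write pattern concrete, here is a minimal userspace sketch of an application writing a file through a shared mapping. The helper name is hypothetical and this is not taken from gcc or any Python tool; it only shows the kind of call sequence that would be expected to fail on a file where shared mmap is denied.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes to path through a MAP_SHARED
 * mapping. Returns 0 on success, -1 on any failure. */
static int write_via_shared_mmap(const char *path, const void *data, size_t len)
{
    if (len == 0)
        return -1;                      /* mmap() rejects a zero length */

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* The file must be large enough to back the mapping. */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    /* This is the call that fails if the filesystem denies shared mmap. */
    void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(map, data, len);
    int rc = msync(map, len, MS_SYNC);  /* flush dirty pages to the file */

    munmap(map, len);
    close(fd);
    return rc == 0 ? 0 : -1;
}
```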
The only real reason people disable direct_io is because many apps need
shared mmap. I've implemented a mode to look up details about the
requesting app and optionally disable direct_io for apps which are known
to need shared mmap, but that isn't ideal. The relaxed mode being
discussed would likely be more performant and more transparent to the
user. That transparency is nice if that can continue as it is already
pretty difficult to explain all these options to the layman.
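As a rough illustration of the per-app workaround described above (not the actual implementation), the lookup could be sketched like this on Linux. Both function names and the idea of matching on /proc/<pid>/comm are assumptions; in a libfuse-based filesystem the pid would presumably come from the request context.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: read the short process name of pid from
 * /proc/<pid>/comm into buf. Returns false if it cannot be read. */
static bool read_comm(pid_t pid, char *buf, size_t buflen)
{
    char path[64];
    snprintf(path, sizeof path, "/proc/%ld/comm", (long)pid);

    FILE *f = fopen(path, "r");
    if (!f)
        return false;

    bool ok = fgets(buf, (int)buflen, f) != NULL;
    fclose(f);

    if (ok)
        buf[strcspn(buf, "\n")] = '\0';  /* strip trailing newline */
    return ok;
}

/* Hypothetical policy: keep direct_io unless the requesting process is
 * on a caller-supplied list of apps assumed to need shared mmap. */
static bool should_use_direct_io(pid_t pid,
                                 const char *const *needs_mmap, size_t n)
{
    char comm[64];

    if (!read_comm(pid, comm, sizeof comm))
        return true;                     /* unknown requester: keep direct_io */

    for (size_t i = 0; i < n; i++)
        if (strcmp(comm, needs_mmap[i]) == 0)
            return false;

    return true;
}
```

The drawback mentioned above is visible in the sketch: the list has to be maintained by hand, and any app not on it still hits the shared mmap restriction.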
Offtopic: What happens in passthrough mode when an error occurs?
Currently I have a number of behaviors relying on the fact that I can
intercept and respond to errors. I think my users will gladly give them
up for near native io perf but if they can get both it would be nice.