Hi,
On 12-08-19 16:17, Christoph Hellwig wrote:
<snip>
The problem is that the IPC to the host which we build upon only offers
regular read / write calls. So the most consistent (also cache coherent)
mapping which we can offer is to directly map read -> read and
write -> write without the pagecache. Ideally we would be able to just
say sorry cannot do mmap, but too many apps rely on mmap and the
out of tree driver has this mmap "emulation" which means not offering
it in the mainline version would be a serious regression.
In essence this is the same situation as a bunch of network filesystems
are in and I've looked at several for inspiration. Looking again at
e.g. v9fs_file_write_iter, it does a similar regular read -> read mapping
with invalidation of the page-cache for mmap users.
v9 is probably not a good idea to copy in general. While not the best
idea to copy directly either, I'd rather look at nfs - that is another
protocol without a real distributed lock manager, but at least the
NFS close-to-open semantics are reasonably well defined and allow using
the pagecache.
Ok, I've been taking a quick peek at always using the page-cache for
writes, like NFS is doing.
One scenario here which I still have questions about is normal write
syscalls on a file opened in append mode. Currently I'm relying on
passing through the append flag to the host while opening the file.
This is fine for address_space_operations.write_end, which AFAICT will be
used when implementing the write_iter callback through generic_perform_write.
In write_end I have access to file->private_data and thus to the IPC handle
representing the open call with the append flag set, so I do not need to
worry about the host having changed the file underneath us: the host itself
will make sure the write gets appended.
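
To illustrate the append guarantee being relied on here, below is a minimal
userspace sketch using plain POSIX calls. It is only an analogy: the second
file descriptor stands in for "the host changing the file underneath us", and
the function name append_demo is made up for this example. Whether the
VirtualBox host side behaves the same way for its own handles is an assumption
of this mail, not something the sketch proves.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical demo helper (not vboxsf code): shows that a write through
 * an O_APPEND descriptor lands at the current end of file, even if the
 * file grew through another descriptor in the meantime.
 * Returns the final file size, or -1 on error.
 */
static long append_demo(const char *path)
{
    struct stat st;
    long size = -1;

    /* Descriptor "a": opened with the append flag set. */
    int a = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_APPEND, 0600);
    if (a < 0)
        return -1;

    /* Descriptor "b": plays the role of someone else modifying the file. */
    int b = open(path, O_WRONLY);
    if (b < 0) {
        close(a);
        unlink(path);
        return -1;
    }

    if (write(a, "aaaa", 4) == 4 &&        /* file is now 4 bytes */
        write(b, "bbbbbbbb", 8) == 8 &&    /* written at offset 0, file grows to 8 */
        write(a, "cc", 2) == 2 &&          /* appended at offset 8, not at offset 4 */
        fstat(a, &st) == 0)
        size = (long)st.st_size;

    close(a);
    close(b);
    unlink(path);
    return size;
}
```

Without O_APPEND on the first descriptor, the last write would land at
offset 4 and the file would stay at 8 bytes instead of growing.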
But what about address_space_operations.writepage? I guess this will never
get called as the result of a write call on a file with the append flag set,
right? So I should have at least one handle around in the list of open
handles for the inode which does not have the append flag set, and which I
can safely use to write back dirty pages coming in through writepage(), right?
Hmm, looking at my current vboxsf writepage code I see that I already only allow
using handles which were opened without the append flag, so I'm pretty sure
that I got this right. Still, if you can confirm this, that would be great.
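
Roughly, the handle selection for writepage can be sketched as follows. This
is a simplified userspace model, not the actual vboxsf code: struct
sketch_handle, its fields and sketch_find_writeback_handle are hypothetical
names, and the extra "writable" check is an assumption on top of the
no-append rule described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one open handle on the inode's handle list. */
struct sketch_handle {
    unsigned long long host_handle; /* opaque IPC handle from the host */
    bool append;                    /* opened with the append flag passed to the host */
    bool writable;                  /* assumption: opened for writing */
    struct sketch_handle *next;
};

/*
 * Return the first handle that is usable for writeback, i.e. writable
 * and NOT opened in append mode, or NULL if there is none. An append
 * handle must be skipped because the host would redirect the write to
 * the end of the file instead of the page's offset.
 */
static const struct sketch_handle *
sketch_find_writeback_handle(const struct sketch_handle *head)
{
    for (const struct sketch_handle *h = head; h != NULL; h = h->next) {
        if (h->writable && !h->append)
            return h;
    }
    return NULL;
}
```

If no such handle exists the caller has to fail the writeback, which is the
case the question above is about.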
And mmap of a file with the append flag set is not supported, so we should
be good there.
Regards,
Hans