On Fri, Jan 25, 2019 at 5:57 PM Miklos Szeredi <miklos@xxxxxxxxxx> wrote: > > On Fri, Jan 25, 2019 at 2:31 PM Miklos Szeredi <miklos@xxxxxxxxxx> wrote: > > > > On Fri, Jan 25, 2019 at 2:04 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote: > > > > Mmm.... so now using upper page cache for write does sound so bad. > > > If only we had shared cache pages ;-) > > > We can treat upper page cache as a vessel for writeback and invalidate > > > it upon completion, since it is *ours*. > > > I'll see what I can come up with. > > > > Just calling ->writepage() of upper fs won't work, filesystems do > > preparatory work in ->prepare_write(), etc, and those need the struct > > file... I don't think we are better off, than calling > > vfs_write_iter(). > > With the cache on the overlay, I think we need to anchor the real file > in the overlay inode instead of the overlay file. That applies to > reads as well as writes. Not sure about lifetimes, obviously it's not > good to keep the real file open if there's no I/O being done. Some > heuristics will be needed, but first version could just keep the file > open until the inode is evicted. > I may be way off, but could you be over complicating this? We can call vfs_write_iter() from ovl_write_end() after data has been copied to overlay pages and while we still have a reference to realfile. Then we copy over the dirt from overlay page cache to upper page cache. When writeback comes along to overlay inode (maybe after close) we only need to writeback the already dirty pages of upper. Once not dirty, we can invalidate upper pages to reduce page cache usage. And what about readpage? why do you say that this applies to reads as well? Do we not have the file during readpage()? Besides, AFAIKT readahead() necessitates that ->readpage() can be called on an inode so we don't need more than the upper inode to for ovl_readpage(). The direct IO read is just an optimization, isn't it? I'll stick with the simple copy_page() implementation (back and forth) unless you come up with a better suggestion. We will probably want something more efficient later on, but I rather start with something simple and working. Thanks, Amir.