On Tue, Sep 29, 2020 at 02:50:15PM -0400, Vivek Goyal wrote:
> Following commit added a flag to invalidate guest page cache automatically.
>
> 72d0d248ca823 fuse: add FUSE_AUTO_INVAL_DATA init flag
>
> Idea seemed to be that for network file systems if client A modifies
> the file, then client B should be able to detect that mtime of file
> changed and invalidate its own cache and fetch new data from server.
>
> There are few questions/issues with this method.
>
> How soon is client B able to detect that file has changed? Should it
> first GETATTR from server for every READ and compare mtime? That
> will be much stronger cache coherency but very slow because every
> READ will first be preceded by a GETATTR.
>
> Or should this be driven by inode timeout? That is, if inode cached attrs
> (including mtime) have timed out, we fetch new mtime from server and
> invalidate cache based on that.
>
> Current logic calls fuse_update_attr() on every READ. But that method
> will result in GETATTR only if either attrs have timed out or if cached
> attrs have been invalidated.
>
> If client B is only doing READs (and not WRITEs), then attrs should be
> valid for inode timeout interval. And that means client B will detect
> mtime change only after timeout interval.
>
> But if client B is also doing WRITE, then once WRITE completes, we
> invalidate cached attrs. That means next READ will force GETATTR()
> and invalidate page cache. In this case client B will detect the
> change by client A much sooner but it can't differentiate between
> its own WRITEs and another client's WRITEs. So every WRITE followed
> by READ will result in GETATTR, followed by page cache invalidation
> and performance suffers in mixed read/write workloads.
>
> I am assuming that intent of auto_inval_data is to detect changes
> by another client but it can take up to "inode timeout" seconds
> to detect that change. (And it does not guarantee an immediate change
> detection).
>
> If above assumption is acceptable, then I am proposing this patch
> which will update attrs on READ only if attrs have timed out. This
> means every second we will do a GETATTR and invalidate page cache.
>
> This is also suboptimal because if only client B is writing, our
> cache is still valid but we will still invalidate it after 1 second.
> But we don't have a good mechanism to differentiate between our own
> changes and another client's changes. So this is probably second
> best method to reduce the extent of issue.
>
> I am running equivalent of following fio workload on virtiofs (cache=auto)
> and there I see a performance improvement of roughly 12%.
>
> fio --direct=1 --gtod_reduce=1 --name=test --filename=random_read_write.fio
> +--bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75
> +--output=/output/fio.txt
>
> NAME                    WORKLOAD        Bandwidth       IOPS
> vtfs-auto-sh            randrw-psync    43.3mb/14.4mb   10.8k/3709
> vtfs-auto-sh-invaltime  randrw-psync    48.9mb/16.3mb   12.2k/4197
>
> Signed-off-by: Vivek Goyal <vgoyal@xxxxxxxxxx>
> ---
>  fs/fuse/dir.c    |  6 ++++++
>  fs/fuse/file.c   | 21 +++++++++++++++------
>  fs/fuse/fuse_i.h |  1 +
>  3 files changed, 22 insertions(+), 6 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@xxxxxxxxxx>
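
For readers following the thread, the difference between the two refresh
policies described in the quoted text can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not the
kernel code and not the patch itself: every name below (toy_inode,
toy_run, POLICY_CURRENT, POLICY_TIMEOUT_ONLY) is hypothetical, time is
counted in abstract ticks, and a single client alternates one WRITE with
one READ per tick.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not taken from fs/fuse. */
enum toy_policy {
    POLICY_CURRENT,      /* READ refreshes attrs if timed out OR invalidated */
    POLICY_TIMEOUT_ONLY  /* READ refreshes attrs only if timed out */
};

struct toy_inode {
    long attr_valid_until;   /* tick at which cached attrs expire */
    bool attr_invalidated;   /* set by our own WRITE */
    long cached_mtime;       /* mtime seen at the last GETATTR */
    int  getattr_count;      /* number of simulated GETATTR requests */
    int  cache_inval_count;  /* number of simulated page cache invalidations */
};

static long toy_server_mtime;  /* stands in for the mtime held by the server */

/* Simulated GETATTR: fetch mtime, drop the page cache if it changed. */
static void toy_getattr(struct toy_inode *in, long now, long timeout)
{
    in->getattr_count++;
    if (in->cached_mtime != toy_server_mtime) {
        in->cache_inval_count++;
        in->cached_mtime = toy_server_mtime;
    }
    in->attr_valid_until = now + timeout;
    in->attr_invalidated = false;
}

/* Simulated WRITE: the server's mtime moves and cached attrs are invalidated. */
static void toy_write(struct toy_inode *in)
{
    toy_server_mtime++;
    in->attr_invalidated = true;
}

/* Simulated READ: decide whether a GETATTR is needed under the given policy. */
static void toy_read(struct toy_inode *in, long now, long timeout,
                     enum toy_policy policy)
{
    bool timed_out = now >= in->attr_valid_until;
    bool refresh = timed_out ||
                   (policy == POLICY_CURRENT && in->attr_invalidated);

    if (refresh)
        toy_getattr(in, now, timeout);
}

/* Run `pairs` WRITE+READ pairs, one pair per tick, and return the counters. */
static struct toy_inode toy_run(enum toy_policy policy, int pairs, long timeout)
{
    struct toy_inode in = {0};

    assert(timeout > 0);
    toy_server_mtime = 0;
    for (long now = 0; now < pairs; now++) {
        toy_write(&in);
        toy_read(&in, now, timeout, policy);
    }
    return in;
}
```

In this model, toy_run(POLICY_CURRENT, 100, 10) issues a GETATTR and
drops the cache on every READ, while toy_run(POLICY_TIMEOUT_ONLY, 100, 10)
does so only once per timeout interval. The model also reproduces the
suboptimal case mentioned above: with a single writer the cache is still
invalidated at each timeout even though no other client changed the file.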