On Mon, 2019-02-18 at 16:40 +0100, Paul Emmerich wrote:
> > A call into libcephfs from ganesha to retrieve cached attributes is
> > mostly just in-memory copies within the same process, so any performance
> > overhead there is pretty minimal. If we need to go to the network to get
> > the attributes, then that was a case where the cache should have been
> > invalidated anyway, and we avoid having to check the validity of the
> > cache.
>
> I've benchmarked a ~15% performance difference in IOPS between cache
> expiration times of 0 and 10 when running fio on a single file from a
> single client.
>

NFS iops? I'd guess more READ ops in particular? Is that with a
FSAL_CEPH backend?

> >
> > >
> > > On Thu, Feb 14, 2019 at 9:04 PM Jeff Layton <jlayton@xxxxxxxxxxxxxxx> wrote:
> > > > On Thu, 2019-02-14 at 20:57 +0800, Marvin Zhang wrote:
> > > > > Here is the copy from https://tools.ietf.org/html/rfc7530#page-40
> > > > > Will the client query the 'change' attribute every time before reading,
> > > > > to know whether the data has been changed?
> > > > >
> > > > > +-----------------+----+------------+-----+-------------------+
> > > > > | Name            | ID | Data Type  | Acc | Defined in        |
> > > > > +-----------------+----+------------+-----+-------------------+
> > > > > | supported_attrs | 0  | bitmap4    | R   | Section 5.8.1.1   |
> > > > > | type            | 1  | nfs_ftype4 | R   | Section 5.8.1.2   |
> > > > > | fh_expire_type  | 2  | uint32_t   | R   | Section 5.8.1.3   |
> > > > > | change          | 3  | changeid4  | R   | Section 5.8.1.4   |
> > > > > | size            | 4  | uint64_t   | R W | Section 5.8.1.5   |
> > > > > | link_support    | 5  | bool       | R   | Section 5.8.1.6   |
> > > > > | symlink_support | 6  | bool       | R   | Section 5.8.1.7   |
> > > > > | named_attr      | 7  | bool       | R   | Section 5.8.1.8   |
> > > > > | fsid            | 8  | fsid4      | R   | Section 5.8.1.9   |
> > > > > | unique_handles  | 9  | bool       | R   | Section 5.8.1.10  |
> > > > > | lease_time      | 10 | nfs_lease4 | R   | Section 5.8.1.11  |
> > > > > | rdattr_error    | 11 | nfsstat4   | R   | Section 5.8.1.12  |
> > > > > | filehandle      | 19 | nfs_fh4    | R   | Section 5.8.1.13  |
> > > > > +-----------------+----+------------+-----+-------------------+
> > > >
> > > > Not every time -- only when the cache needs revalidation.
> > > >
> > > > In the absence of a delegation, that happens on a timeout (see the
> > > > acregmin/acregmax settings in nfs(5)), though things like opens and file
> > > > locking events also affect when the client revalidates.
> > > >
> > > > When the v4 client does revalidate the cache, it relies heavily on the
> > > > NFSv4 change attribute. CephFS's change attribute is cluster-coherent
> > > > too, so if the client does revalidate, it should see changes made on
> > > > other servers.
> > > >
> > > > > On Thu, Feb 14, 2019 at 8:29 PM Jeff Layton <jlayton@xxxxxxxxxxxxxxx> wrote:
> > > > > > On Thu, 2019-02-14 at 19:49 +0800, Marvin Zhang wrote:
> > > > > > > Hi Jeff,
> > > > > > > Another question is about client caching when delegations are disabled.
> > > > > > > I set a breakpoint on nfs4_op_read, which is the OP_READ processing
> > > > > > > function in nfs-ganesha. Then I read a file, and the breakpoint was hit
> > > > > > > only on the first read; later reads of the same file did not trigger
> > > > > > > OP_READ at all -- the data was served from the client-side cache. Is
> > > > > > > that right?
> > > > > >
> > > > > > Yes. In the absence of a delegation, the client will periodically query
> > > > > > for the inode attributes, and will serve reads from the cache if it
> > > > > > looks like the file hasn't changed.
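> > > > > >
> > > > > > To make that concrete, the revalidation decision boils down to
> > > > > > something like this (a simplified sketch, not the actual Linux client
> > > > > > code -- the struct and function names are made up for illustration):
> > > > > >
> > > > > > #include <stdbool.h>
> > > > > > #include <stdint.h>
> > > > > >
> > > > > > struct cached_attrs {
> > > > > >         uint64_t change;  /* NFSv4 change attribute (changeid4) */
> > > > > > };
> > > > > >
> > > > > > /*
> > > > > >  * Once the attribute-cache timeout expires, the client fetches fresh
> > > > > >  * attributes with GETATTR and compares the change attribute against
> > > > > >  * the cached one; a mismatch means cached file data must be dropped
> > > > > >  * and re-read from the server.
> > > > > >  */
> > > > > > bool data_cache_is_stale(const struct cached_attrs *cached,
> > > > > >                          uint64_t fresh_change)
> > > > > > {
> > > > > >         return cached->change != fresh_change;
> > > > > > }
> > > > > >
> > > > > > int main(void)
> > > > > > {
> > > > > >         struct cached_attrs c = { .change = 42 };
> > > > > >
> > > > > >         /* A fresh GETATTR returning 43 means the file has changed. */
> > > > > >         return data_cache_is_stale(&c, 43) ? 0 : 1;
> > > > > > }
> > > > > >
> > > > > > How often that check runs is bounded by the attribute-cache mount
> > > > > > options, e.g. (export and mountpoint here are placeholders):
> > > > > >
> > > > > > mount -t nfs4 -o acregmin=3,acregmax=60 server:/export /mnt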
> > > > > >
> > > > > > > I also checked the NFS client code in the Linux kernel. Only when
> > > > > > > cache_validity contains NFS_INO_INVALID_DATA will it invalidate the
> > > > > > > mapping and send OP_READ again, like this:
> > > > > > >
> > > > > > > if (nfsi->cache_validity & NFS_INO_INVALID_DATA) {
> > > > > > >         ret = nfs_invalidate_mapping(inode, mapping);
> > > > > > > }
> > > > > > >
> > > > > > > Think about this scenario: client1 connects to ganesha1 and client2
> > > > > > > connects to ganesha2. I read /1.txt on client1, and client1 caches
> > > > > > > the data. Then I modify this file on client2. At that point, how does
> > > > > > > client1 know the file is modified, and how does NFS_INO_INVALID_DATA
> > > > > > > get set in cache_validity?
> > > > > >
> > > > > > Once you modify the file on client2, ganesha2 will request the
> > > > > > necessary caps from the ceph MDS, and ganesha1 will have its caps
> > > > > > revoked. ganesha2 will then make the change.
> > > > > >
> > > > > > When client1 reads again it will issue a GETATTR against the file [1].
> > > > > > ganesha1 will then request caps to do the getattr, which will end up
> > > > > > revoking ganesha2's caps. client1 will then see the change in
> > > > > > attributes (the change attribute and mtime, most likely) and will
> > > > > > invalidate the mapping, causing it to reissue a READ on the wire.
> > > > > >
> > > > > > [1]: There may be a window of time after you change the file on
> > > > > > client2 where client1 doesn't see it. That's due to the fact that
> > > > > > inode attributes on the client are only revalidated after a timeout.
> > > > > > You may want to read over the DATA AND METADATA COHERENCE section of
> > > > > > nfs(5) to make sure you understand how the NFS client validates its
> > > > > > caches.
> > > > > >
> > > > > > Cheers,
> > > > > > --
> > > > > > Jeff Layton <jlayton@xxxxxxxxxxxxxxx>
> > > >
> > > > --
> > > > Jeff Layton <jlayton@xxxxxxxxxxxxxxx>
> >
> > --
> > Jeff Layton <jlayton@xxxxxxxxxxxxxxx>

--
Jeff Layton <jlayton@xxxxxxxxxxxxxxx>

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com