On Tue, Mar 12, 2019 at 9:46 AM Jeff Layton <jlayton@xxxxxxxxxxxxxxx> wrote:
>
> I have more questions about MDS caps. The File (F*) caps in cephfs are
> very granular, such that it's not clear what extra ability each one
> grants with respect to the others. Here's the list:
>
> #define CEPH_CAP_FILE_SHARED (CEPH_CAP_GSHARED << CEPH_CAP_SFILE)
> #define CEPH_CAP_FILE_EXCL (CEPH_CAP_GEXCL << CEPH_CAP_SFILE)
> #define CEPH_CAP_FILE_CACHE (CEPH_CAP_GCACHE << CEPH_CAP_SFILE)
> #define CEPH_CAP_FILE_RD (CEPH_CAP_GRD << CEPH_CAP_SFILE)
> #define CEPH_CAP_FILE_WR (CEPH_CAP_GWR << CEPH_CAP_SFILE)
> #define CEPH_CAP_FILE_BUFFER (CEPH_CAP_GBUFFER << CEPH_CAP_SFILE)
> #define CEPH_CAP_FILE_WREXTEND (CEPH_CAP_GWREXTEND << CEPH_CAP_SFILE)
> #define CEPH_CAP_FILE_LAZYIO (CEPH_CAP_GLAZYIO << CEPH_CAP_SFILE)
>
> My questions:
>
> 1) Why do we have SHARED and CACHE (and similarly EXCL and BUFFER)?
> Shouldn't one imply the other? Under what circumstances would you issue
> them independently of one another?

CACHE and BUFFER are special kinds of caps. They apply only to the FILE
inode cap, and they refer to whether you can cache or buffer data
extents of the file. SHARED and EXCL refer to inode attributes and apply
to every kind of cap. They certainly move together often (perhaps
always?), but they are distinct because CACHE and BUFFER are "extra".

> 2) My understanding (quite possibly wrong) is that RD and WR are really
> there to cover the validity of the file layout. Should SHARED/EXCL imply
> those as well?

Ah, that's not right. RD and WR also apply only to File caps, and mean
that you can read and write the file data from the OSDs. They *do not*
map onto the same things as SHARED and EXCL do!

You can easily have multiple writers who each have Fsrw caps, meaning
they can read and write to the data and have shared permissions on the
file metadata. They don't have exclusive caps because there are multiple
active writers changing state. This means, among other things, that they
can't extend the file size without an MDS request; issuing one would
temporarily revoke those Fs caps but I think not the Frw ones.

> 3) Is WREXTEND deprecated? The client seems to ignore it.

Uh, I'm not familiar with that one, so I'm going with "yes". In fact
sha1 ca6c8a7a1956691837948c38ff7c5b7c45f2a051 states
"CEPH_CAP_FILE_WREXTEND is an unused bit, reuse it for CEPH_STAT_RSTAT".

> 4) LAZYIO is there, but its semantics are not documented at all, AFAICT.
> I get that it's supposed to relax ceph's caching semantics. Under what
> circumstances _should_ the client invalidate cached dentries and inodes
> when this is set? IOW, what are the lazyio "rules" ?

I don't think these are very well specified, especially around dentry
and inode caches. IIRC LAZYIO was created in anticipation of the
proposed Linux LAZYIO extensions, but in CephFS applies mostly to cached
file data to allow conflicting caches and buffers in situations where
clients handle their own consistency bounds (i.e., HPC applications where
each client gets its own range of a file to play in and doesn't touch
that of anybody else.)

-Greg
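
To make the split between the attribute bits (s/x) and the data bits
(c/r/w/b) concrete, here is a rough sketch in C. The numeric values of
the generic bits and of CEPH_CAP_SFILE, and the "sxcrwbal" letter order,
are as I recall them from ceph_fs.h and should be treated as
assumptions; file_caps_str() and the three predicates are hypothetical
helpers written for this mail, not anything in the tree.

```c
/* Assumed values; check against ceph_fs.h before relying on them. */
#define CEPH_CAP_GSHARED     1
#define CEPH_CAP_GEXCL       2
#define CEPH_CAP_GCACHE      4
#define CEPH_CAP_GRD         8
#define CEPH_CAP_GWR        16
#define CEPH_CAP_GBUFFER    32
#define CEPH_CAP_GWREXTEND  64
#define CEPH_CAP_GLAZYIO   128
#define CEPH_CAP_SFILE       8

#define CEPH_CAP_FILE_SHARED (CEPH_CAP_GSHARED << CEPH_CAP_SFILE)
#define CEPH_CAP_FILE_EXCL   (CEPH_CAP_GEXCL   << CEPH_CAP_SFILE)
#define CEPH_CAP_FILE_CACHE  (CEPH_CAP_GCACHE  << CEPH_CAP_SFILE)
#define CEPH_CAP_FILE_RD     (CEPH_CAP_GRD     << CEPH_CAP_SFILE)
#define CEPH_CAP_FILE_WR     (CEPH_CAP_GWR     << CEPH_CAP_SFILE)
#define CEPH_CAP_FILE_BUFFER (CEPH_CAP_GBUFFER << CEPH_CAP_SFILE)

/* Hypothetical helper: render the file cap bits as "F" plus one letter
 * per set bit. buf must hold at least 10 bytes ("F" + 8 letters + NUL). */
static char *file_caps_str(unsigned int caps, char *buf)
{
    static const char letters[] = "sxcrwbal"; /* assumed letter order */
    unsigned int bits = (caps >> CEPH_CAP_SFILE) & 0xff;
    char *p = buf;
    int i;

    *p++ = 'F';
    for (i = 0; i < 8; i++)
        if (bits & (1u << i))
            *p++ = letters[i];
    *p = '\0';
    return buf;
}

/* Fr and Fw: may read and write file data directly from the OSDs. */
static int can_rw_data(unsigned int caps)
{
    unsigned int want = CEPH_CAP_FILE_RD | CEPH_CAP_FILE_WR;
    return (caps & want) == want;
}

/* Fx: exclusive access to the inode attributes covered by the file cap. */
static int has_excl_metadata(unsigned int caps)
{
    return (caps & CEPH_CAP_FILE_EXCL) != 0;
}

/* Fc or Fb: may cache reads or buffer writes of file data extents. */
static int can_cache_or_buffer(unsigned int caps)
{
    return (caps & (CEPH_CAP_FILE_CACHE | CEPH_CAP_FILE_BUFFER)) != 0;
}
```

With caps = CEPH_CAP_FILE_SHARED | CEPH_CAP_FILE_RD | CEPH_CAP_FILE_WR
(the multiple-writer case above), file_caps_str() gives "Fsrw",
can_rw_data() is true, and both has_excl_metadata() and
can_cache_or_buffer() are false.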