Re: MDS: what do FILE_SHARED and FILE_EXCLUSIVE caps actually represent?

On Tue, Mar 12, 2019 at 9:46 AM Jeff Layton <jlayton@xxxxxxxxxxxxxxx> wrote:
>
> I have more questions about MDS caps. The File (F*) caps in cephfs are
> very granular, such that it's not clear what extra ability each one
> grants with respect to the others. Here's the list:
>
> #define CEPH_CAP_FILE_SHARED   (CEPH_CAP_GSHARED   << CEPH_CAP_SFILE)
> #define CEPH_CAP_FILE_EXCL     (CEPH_CAP_GEXCL     << CEPH_CAP_SFILE)
> #define CEPH_CAP_FILE_CACHE    (CEPH_CAP_GCACHE    << CEPH_CAP_SFILE)
> #define CEPH_CAP_FILE_RD       (CEPH_CAP_GRD       << CEPH_CAP_SFILE)
> #define CEPH_CAP_FILE_WR       (CEPH_CAP_GWR       << CEPH_CAP_SFILE)
> #define CEPH_CAP_FILE_BUFFER   (CEPH_CAP_GBUFFER   << CEPH_CAP_SFILE)
> #define CEPH_CAP_FILE_WREXTEND (CEPH_CAP_GWREXTEND << CEPH_CAP_SFILE)
> #define CEPH_CAP_FILE_LAZYIO   (CEPH_CAP_GLAZYIO   << CEPH_CAP_SFILE)
>
> My questions:
>
> 1) Why do we have SHARED and CACHE (and similarly EXCL and BUFFER)?
> Shouldn't one imply the other? Under what circumstances would you issue
> them independently of one another?

CACHE and BUFFER are special kinds of caps. They apply only to the
FILE inode cap, and they refer to whether you can cache or buffer data
extents of the file. SHARED and EXCL refer to inode attributes and
apply to every kind of cap.
They certainly move together often (perhaps always?), but they are
distinct because CACHE and BUFFER are "extra".
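
To make the distinction concrete, here is a minimal C sketch of the layout described above. The EX_ names and all numeric values are assumptions chosen for illustration (they are modeled on the CEPH_CAP_G* and CEPH_CAP_S* pattern in the quoted defines, not copied from the headers). It shows shared/exclusive bits existing for every field, while the cache and buffer bits only fit in the wider file field.

```c
#include <assert.h>
#include <stdint.h>

/* Generic per-field bits (illustrative values, assumed for this sketch). */
enum {
    EX_GSHARED = 1,   /* shared access to the field's attributes */
    EX_GEXCL   = 2,   /* exclusive access to the field's attributes */
    EX_GCACHE  = 4,   /* file field only: may cache data extents */
    EX_GRD     = 8,   /* file field only: may read data from the OSDs */
    EX_GWR     = 16,  /* file field only: may write data to the OSDs */
    EX_GBUFFER = 32   /* file field only: may buffer writes */
};

/* Bit offset of each field within the cap word (assumed). */
enum { EX_SAUTH = 2, EX_SLINK = 4, EX_SXATTR = 6, EX_SFILE = 8 };

/* Every field has a shared and an exclusive bit... */
#define EX_AUTH_SHARED  (EX_GSHARED << EX_SAUTH)
#define EX_LINK_SHARED  (EX_GSHARED << EX_SLINK)
#define EX_XATTR_SHARED (EX_GSHARED << EX_SXATTR)
#define EX_FILE_SHARED  (EX_GSHARED << EX_SFILE)
#define EX_FILE_EXCL    (EX_GEXCL   << EX_SFILE)

/* ...but only the file field has the "extra" data bits. */
#define EX_FILE_CACHE   (EX_GCACHE  << EX_SFILE)
#define EX_FILE_BUFFER  (EX_GBUFFER << EX_SFILE)

/* With two-bit simple fields, a cache bit shifted into the auth slot
 * would land on the link field's shared bit, so it cannot exist there. */
static void ex_check_layout(void)
{
    assert((EX_GCACHE << EX_SAUTH) == EX_LINK_SHARED);
}

/* May the holder keep file data extents in its cache? */
static int ex_may_cache_data(uint32_t caps)
{
    return (caps & EX_FILE_CACHE) != 0;
}

/* May the holder buffer dirty file data before writing it out? */
static int ex_may_buffer_writes(uint32_t caps)
{
    return (caps & EX_FILE_BUFFER) != 0;
}
```

In this model a client holding only EX_FILE_SHARED can trust the inode attributes but gets no permission to cache data, which is why the two bits are tracked separately even if they are usually issued together.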

> 2) My understanding (quite possibly wrong) is that RD and WR are really
> there to cover the validity of the file layout. Should SHARED/EXCL imply
> those as well?

Ah, that's not right. RD and WR also apply only to File caps, and mean
that you can read and write the file data from the OSDs. They *do not*
map on to the same things as SHARED and EXCL do! You can easily have
multiple writers who each have Fsrw caps, meaning they can read and
write to the data and have shared permissions on the file metadata.
They don't have exclusive caps because there are multiple active
writers changing state. This means, among other things, that they
can't extend the file size without an MDS request; issuing one would
temporarily revoke those Fs caps but I think not the Frw ones.
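
The multiple-writer case can be sketched in C as follows. The EX_ names, bit values, and helper functions are hypothetical and exist only for this example. The extend check models just the statement above (no exclusive cap means going through the MDS); the real client logic takes more into account than this one bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* File cap bits (illustrative values, assumed for this sketch). */
enum {
    EX_FS = 1  << 8,  /* shared */
    EX_FX = 2  << 8,  /* exclusive */
    EX_FC = 4  << 8,  /* cache */
    EX_FR = 8  << 8,  /* read */
    EX_FW = 16 << 8,  /* write */
    EX_FB = 32 << 8   /* buffer */
};

/* Render the file caps in the short notation used in this thread,
 * e.g. "Fsrw". buf must have room for at least 8 bytes. */
static void ex_file_cap_string(uint32_t caps, char buf[8])
{
    static const struct { uint32_t bit; char ch; } tbl[] = {
        { EX_FS, 's' }, { EX_FX, 'x' }, { EX_FC, 'c' },
        { EX_FR, 'r' }, { EX_FW, 'w' }, { EX_FB, 'b' },
    };
    size_t n = 0;

    assert(buf != NULL);
    buf[n++] = 'F';
    for (size_t i = 0; i < sizeof(tbl) / sizeof(tbl[0]); i++) {
        if (caps & tbl[i].bit)
            buf[n++] = tbl[i].ch;
    }
    buf[n] = '\0';
}

/* Models the statement above: a writer without the exclusive cap
 * cannot grow the file on its own and must ask the MDS. */
static int ex_extend_needs_mds(uint32_t caps)
{
    return (caps & EX_FX) == 0;
}
```

Two clients that each hold EX_FS | EX_FR | EX_FW both print as "Fsrw" and both get a nonzero result from ex_extend_needs_mds(), while a sole writer holding EX_FX does not.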

> 3) Is WREXTEND deprecated? The client seems to ignore it.

Uh, I'm not familiar with that one, so I'm going with "yes".
In fact, commit ca6c8a7a1956691837948c38ff7c5b7c45f2a051 states
"CEPH_CAP_FILE_WREXTEND is an unused bit, reuse it for
CEPH_STAT_RSTAT".

> 4) LAZYIO is there, but its semantics are not documented at all, AFAICT.
> I get that it's supposed to relax ceph's caching semantics. Under what
> circumstances _should_ the client invalidate cached dentries and inodes
> when this is set? IOW, what are the lazyio "rules" ?

I don't think these are very well specified, especially around dentry
and inode caches. IIRC LAZYIO was created in anticipation of the
proposed Linux LAZYIO extensions, but in CephFS applies mostly to
cached file data to allow conflicting caches and buffers in situations
where clients handle their own consistency bounds (i.e., HPC
applications where each client gets its own range of a file to play in
and doesn't touch that of anybody else.)
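
As a rough illustration of that HPC pattern, here is a C sketch in which each client derives its own byte range from its rank, so that no two clients touch the same part of the file. The struct and function names are hypothetical, and nothing here calls into CephFS; it only shows the kind of application-level partitioning that would make relaxed caching safe.

```c
#include <assert.h>
#include <stdint.h>

/* A half-open byte range [off, off + len) within the shared file. */
struct ex_range {
    uint64_t off;
    uint64_t len;
};

/* Split [0, file_size) into nclients contiguous, non-overlapping
 * ranges. The first (file_size % nclients) ranks get one extra byte. */
static struct ex_range ex_client_range(uint64_t file_size,
                                       uint32_t nclients, uint32_t rank)
{
    struct ex_range r;
    uint64_t base, extra;

    assert(nclients > 0 && rank < nclients);
    base = file_size / nclients;
    extra = file_size % nclients;
    r.off = (uint64_t)rank * base + (rank < extra ? rank : extra);
    r.len = base + (rank < extra ? 1 : 0);
    return r;
}

/* Nonzero if the two ranges share at least one byte. */
static int ex_ranges_overlap(struct ex_range a, struct ex_range b)
{
    return a.off < b.off + b.len && b.off < a.off + a.len;
}
```

As long as every client reads and writes only inside its own ex_range, conflicting caches and buffers on other clients never cover the same bytes, which is the situation the relaxed semantics are aimed at.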
-Greg


