On Thu, Jan 17, 2019 at 10:27:20AM -0800, Void Star Nill wrote:
> Hi,
>
> We are trying to use Ceph in our products to address some of our use
> cases. We think the Ceph block device is a good fit for us. One of the
> use cases is that we have a number of jobs running in containers that
> need read-only access to shared data. The data is written once and is
> consumed multiple times. I have read through some of the similar
> discussions and the recommendations on using CephFS for these
> situations, but in our case the block device makes more sense as it
> fits well with other use cases and restrictions we have around this
> use case.
>
> The following scenario seems to work as expected when we tried it on a
> test cluster, but we wanted to get an expert opinion to see if there
> would be any issues in production. The usage scenario is as follows:
>
> - A block device is created with the "--image-shared" option:
>
>   rbd create mypool/foo --size 4G --image-shared

"--image-shared" just means that the created image will have the
"exclusive-lock" feature, and all other features that depend on it,
disabled. It is useful for scenarios where one wants simultaneous write
access to the image (e.g. when using a shared-disk cluster fs like
ocfs2) and does not want the performance penalty of "exclusive-lock"
being ping-ponged between writers. For your scenario it is not
necessary, but it is ok.

> - The image is mapped to a host, formatted as ext4 (or another
>   filesystem), mounted to a directory in read/write mode, and data is
>   written to it. Please note that the image will be mapped in
>   exclusive write mode -- no other read/write mounts are allowed at
>   this time.

The map "exclusive" option works only for images with the
"exclusive-lock" feature enabled, and in that case it prevents
automatic exclusive lock transitions (the ping-pong mentioned above)
from one writer to another. It will not prevent mapping and mounting
the image ro, and probably not even rw (I am not familiar enough with
the kernel rbd implementation to be sure here), though in the latter
case the writes will fail.

> - The volume is unmapped from the host and then mapped onto N other
>   hosts, where it is mounted in read-only mode and the data is read
>   simultaneously by the N readers.
>
> As mentioned above, this seems to work as expected, but we wanted to
> confirm that we won't run into any unexpected issues.

It should work. Although, as you can see, rbd hardly protects
simultaneous access in this case, so it should be carefully organized
at a higher level.

You may also consider creating a snapshot after modifying the image and
mapping and mounting the snapshot on the readers. This way you can even
modify the image without unmounting the readers, and then remap/remount
the new snapshot. And you get a rollback option for free.

Also, there is a valid concern mentioned by others that ext4 might want
to flush the journal if it is not clean, even when mounting ro. I
expect the mount will just fail in this case because the image is
mapped ro, but you might want to investigate how to handle this.

-- 
Mykola Golub
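
[Editor's note: for reference, a minimal sketch of the snapshot-based
workflow described above. The pool/image/snapshot names, mount point,
and /dev/rbd0 device path are placeholders; the actual device name is
assigned at map time.]

  # On the writer: create the image, map it rw, format and fill it
  rbd create mypool/foo --size 4G --image-shared
  rbd map mypool/foo                 # prints the device, e.g. /dev/rbd0
  mkfs.ext4 /dev/rbd0
  mount /dev/rbd0 /mnt/foo
  # ... write the data ...
  umount /mnt/foo                    # ensures the fs is clean before snapshotting

  # Snapshot the now-consistent image and release it
  rbd snap create mypool/foo@v1
  rbd unmap mypool/foo

  # On each of the N readers: map the snapshot (krbd maps snapshots
  # read-only) and mount it ro. The ext4 "noload" option skips journal
  # replay, which avoids the journal-flush concern mentioned above.
  rbd map mypool/foo@v1
  mount -o ro,noload /dev/rbd0 /mnt/foo

To publish new data, write it to the image, take a new snapshot (e.g.
mypool/foo@v2), and have the readers unmount/unmap @v1 and map/mount
@v2; the old snapshot remains available for rollback until it is
removed.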