On Fri, Jan 18, 2019 at 11:25 AM Mykola Golub <to.my.trociny@xxxxxxxxx> wrote:
>
> On Thu, Jan 17, 2019 at 10:27:20AM -0800, Void Star Nill wrote:
> > Hi,
> >
> > We are trying to use Ceph in our products to address some of our use
> > cases. We think the Ceph block device is a good fit for us. One of
> > the use cases is that we have a number of jobs running in containers
> > that need read-only access to shared data. The data is written once
> > and is consumed multiple times. I have read through some of the
> > similar discussions and the recommendations on using CephFS for these
> > situations, but in our case a block device makes more sense as it
> > fits well with other use cases and restrictions we have around this
> > use case.
> >
> > The following scenario seems to work as expected when we tried it on
> > a test cluster, but we wanted to get an expert opinion to see if
> > there would be any issues in production. The usage scenario is as
> > follows:
> >
> > - A block device is created with the "--image-shared" option:
> >
> >     rbd create mypool/foo --size 4G --image-shared
>
> "--image-shared" just means that the created image will have the
> "exclusive-lock" feature, and all other features that depend on it,
> disabled. It is useful for scenarios where one wants simultaneous
> write access to the image (e.g. when using a shared-disk cluster fs
> like ocfs2) and does not want the performance penalty of
> "exclusive-lock" being ping-ponged between writers.
>
> For your scenario it is not necessary, but it is OK.
>
> > - The image is mapped to a host, formatted with ext4 (or another
> > filesystem), mounted to a directory in read/write mode and data is
> > written to it. Please note that the image will be mapped in exclusive
> > write mode -- no other read/write mounts are allowed at this time.
>
> The map "exclusive" option works only for images with the
> "exclusive-lock" feature enabled and in that case prevents automatic
> exclusive lock transitions (the ping-pong mentioned above) from one
> writer to another. It will not prevent mapping and mounting the image
> ro, and probably not even rw (I am not familiar enough with the kernel
> rbd implementation to be sure here), though in the latter case the
> writes will fail.

With -o exclusive, in addition to preventing automatic lock
transitions, the kernel will attempt to acquire the lock at map time
(i.e. before allowing any I/O) and return an error from "rbd map" in
case the lock cannot be acquired.

However, the fact that the image is mapped -o exclusive on one host
doesn't mean that it can't be mapped without -o exclusive on another
host. If you then try to write through the non-exclusive mapping, the
write will block until the exclusive mapping goes away, resulting in
hung tasks in uninterruptible sleep state -- a much less pleasant
failure mode. So make sure that all writers use -o exclusive.

Thanks,

                Ilya
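
P.S. To put the above together, here is a rough sketch of the
single-writer side. It assumes the image is created with the default
features (i.e. without --image-shared) so that exclusive-lock is
available for -o exclusive; the image name mypool/foo is reused from
the thread, and the mount point, data path and use of $DEV are just
examples to adjust for your setup:

    # create the image with default features (exclusive-lock enabled)
    rbd create mypool/foo --size 4G

    # map on the one writer host; -o exclusive makes "rbd map" acquire
    # the lock up front and fail if it cannot be acquired
    DEV=$(rbd map mypool/foo -o exclusive)

    # format, mount, write the data, then release the image
    mkfs.ext4 "$DEV"
    mkdir -p /mnt/foo
    mount "$DEV" /mnt/foo
    cp -a /data/to/publish/. /mnt/foo/
    umount /mnt/foo
    rbd unmap "$DEV"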
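
On the read-only consumers the image can then be mapped and mounted
read-only, so they never issue writes and cannot hit the blocked-write
failure mode described above. Again only a sketch -- mount options may
need tweaking for your filesystem:

    # map read-only on each reader host
    DEV=$(rbd map mypool/foo --read-only)

    # mount read-only; for ext4, "noload" skips journal replay, which a
    # read-only mapping could not perform anyway
    mount -o ro,noload "$DEV" /mnt/foo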