On Wed, May 4, 2016 at 3:39 AM, Burkhard Linke
<Burkhard.Linke@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> Hi,
>
> On 03.05.2016 18:39, Gregory Farnum wrote:
>>
>> On Tue, May 3, 2016 at 9:30 AM, Burkhard Linke
>> <Burkhard.Linke@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>>>
>>> Hi,
>>>
>>> we have a number of legacy applications that do not cope well with
>>> the POSIX locking semantics in CephFS due to missing locking support
>>> (e.g. flock syscalls). We are able to fix some of these
>>> applications, but others are binary only.
>>>
>>> Is it possible to disable POSIX locking completely in CephFS (either
>>> kernel client or ceph-fuse)?
>>
>> I'm confused. CephFS supports all of these — although some versions
>> of FUSE don't; you need a new-ish kernel.
>>
>> So are you saying that
>> 1) in your setup, it doesn't support both fcntl and flock, and
>> 2) some of your applications don't do well under that scenario?
>>
>> I don't really see how it's safe for you to just disable the
>> underlying file locking in an application which depends on it. You
>> may need to upgrade enough that all file locks are supported.
>
> The application in question does a binary search in a large data file
> (~75 GB) stored on CephFS. It uses open and mmap without any further
> locking controls (neither fcntl nor flock). Performance was very poor
> with CephFS (Ubuntu Trusty with the 4.4 backport kernel from Xenial,
> and ceph-fuse) compared to the same application on NFS-based storage.
> I haven't had the time to dig further into the kernel implementation
> yet, but I assume the root cause is locking of pages accessed via the
> memory-mapped file. Adding a single flock syscall to mark the data
> file as globally shared solved the problem for us, reducing the
> overall runtime from nearly 2 hours to 5 minutes (and thus comparable
> to the NFS control case). The application runs on our HPC cluster, so
> several hundred instances may access the same data file at once.
>
> We have other applications that were written without locking support
> and that do not perform well with CephFS. There was a thread in
> February with a short discussion of CephFS mmap performance
> (http://article.gmane.org/gmane.comp.file-systems.ceph.user/27501).
> As pointed out in that thread, the problem is not only mmap itself,
> but also the need to implement proper invalidation. We cannot fix
> this for all our applications due to the lack of manpower and, in
> some cases, the lack of source code. We either have to find a way to
> make them work with CephFS, or use a different setup, e.g. an extra
> NFS-based mount point re-exporting CephFS. I would like to avoid the
> latter solution...
>
> Disabling the POSIX semantics and falling back to more NFS-like
> semantics without guarantees would be a setback, but probably the
> easier way (if it is possible at all). Most data accessed by these
> applications is read-only, so complex locking is not necessary in
> these cases.

See http://tracker.ceph.com/issues/15502; it may be related to this
issue.

Regards
Yan, Zheng

> Regards,
> Burkhard
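
For reference, a minimal sketch of the flock workaround Burkhard
describes: take a shared lock on the data file before mmap'ing it
read-only. This is an illustration, not code from the thread; the path
is hypothetical and error handling is abbreviated.

/* Take a shared flock on the data file before mmap'ing it read-only,
 * so the filesystem can treat access as shared rather than
 * potentially exclusive. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/file.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    const char *path = "/cephfs/data/index.bin";  /* hypothetical path */

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return EXIT_FAILURE;
    }

    /* Mark the file as shared. LOCK_SH never conflicts with other
     * LOCK_SH holders, so hundreds of HPC jobs can hold it at once. */
    if (flock(fd, LOCK_SH) < 0) {
        perror("flock");
        close(fd);
        return EXIT_FAILURE;
    }

    struct stat st;
    if (fstat(fd, &st) < 0) {
        perror("fstat");
        close(fd);
        return EXIT_FAILURE;
    }

    void *map = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        perror("mmap");
        close(fd);
        return EXIT_FAILURE;
    }

    /* ... binary search over the mapped region as before ... */

    munmap(map, st.st_size);
    flock(fd, LOCK_UN);  /* also released implicitly on close */
    close(fd);
    return EXIT_SUCCESS;
}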
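
For the binary-only applications, one possible (untested, assumed)
approach is an LD_PRELOAD shim that wraps open(2) and takes the same
shared flock on matching paths, so the binaries need not be patched. A
sketch, assuming glibc and a single hard-coded data prefix; real
binaries may go through open64 or openat instead, which would need the
same treatment.

/* shim.c — hypothetical LD_PRELOAD shim: wraps open(2) and takes a
 * shared flock on read-only opens under a chosen prefix.
 * Build: gcc -shared -fPIC -o shim.so shim.c -ldl
 * Run:   LD_PRELOAD=./shim.so ./legacy-app
 * The prefix is illustrative. */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <fcntl.h>
#include <stdarg.h>
#include <string.h>
#include <sys/file.h>
#include <sys/types.h>

static const char *PREFIX = "/cephfs/data/";  /* assumed data location */

int open(const char *path, int flags, ...)
{
    static int (*real_open)(const char *, int, ...) = NULL;
    if (!real_open)
        real_open = (int (*)(const char *, int, ...))
                        dlsym(RTLD_NEXT, "open");

    /* Forward the optional mode argument when O_CREAT is set. */
    mode_t mode = 0;
    if (flags & O_CREAT) {
        va_list ap;
        va_start(ap, flags);
        mode = va_arg(ap, mode_t);
        va_end(ap);
    }

    int fd = real_open(path, flags, mode);

    /* Take the shared lock only for read-only opens under the prefix;
     * everything else passes through untouched. */
    if (fd >= 0 && (flags & O_ACCMODE) == O_RDONLY &&
        strncmp(path, PREFIX, strlen(PREFIX)) == 0)
        flock(fd, LOCK_SH);

    return fd;
}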