Re: Disabling POSIX locking semantics for CephFS

Hi,

On 03.05.2016 18:39, Gregory Farnum wrote:
> On Tue, May 3, 2016 at 9:30 AM, Burkhard Linke
> <Burkhard.Linke@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>> Hi,
>>
>> we have a number of legacy applications that do not cope well with the POSIX
>> locking semantics in CephFS due to missing locking support (e.g. flock
>> syscalls). We are able to fix some of these applications, but others are
>> binary only.
>>
>> Is it possible to disable POSIX locking completely in CephFS (either kernel
>> client or ceph-fuse)?
> I'm confused. CephFS supports all of these -- although some versions of
> FUSE don't; you need a new-ish kernel.
>
> So are you saying that
> 1) in your setup, it doesn't support both fcntl and flock,
> 2) that some of your applications don't do well under that scenario?
>
> I don't really see how it's safe for you to just disable the
> underlying file locking in an application which depends on it. You may
> need to upgrade enough that all file locks are supported.

The application in question does a binary search in a large data file (~75 GB) stored on CephFS. It uses open and mmap without any further locking (neither fcntl nor flock). Performance was very poor on CephFS (Ubuntu Trusty with the 4.4 backport kernel from Xenial, and ceph-fuse) compared to the same application on NFS-based storage. I haven't had time to dig into the kernel implementation yet, but I assume the root cause is the locking of pages accessed via the memory-mapped file.

Adding a simple flock syscall that marks the data file globally as shared solved the problem for us, reducing the overall runtime from nearly 2 hours to 5 minutes (and thus comparable to the NFS control case). The application runs on our HPC cluster, so several hundred instances may access the same data file at once.
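For reference, a minimal sketch of that one-line fix (the path and the surrounding binary search are placeholders, not the real application):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    /* placeholder path, the real file lives on a CephFS mount */
    const char *path = argc > 1 ? argv[1] : "/ceph/data/index.bin";
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    /* The added call: advertise shared, read-only access to the file.
     * Without it, reads through the mapping were far slower on CephFS. */
    if (flock(fd, LOCK_SH) != 0) { perror("flock"); return 1; }

    struct stat st;
    if (fstat(fd, &st) != 0) { perror("fstat"); return 1; }

    void *map = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) { perror("mmap"); return 1; }

    /* ... binary search over the mapped file would go here ... */

    munmap(map, st.st_size);
    close(fd);
    return 0;
}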

We have other applications that were written without locking support and that do not perform well on CephFS. There was a thread in February with a short discussion of CephFS mmap performance (http://article.gmane.org/gmane.comp.file-systems.ceph.user/27501). As pointed out there, the problem is not only mmap itself, but also the need to implement proper invalidation. We cannot fix this for all our applications, due to a lack of manpower and, in some cases, a lack of source code. We either have to find a way to make them work with CephFS, or use a different setup, e.g. an extra NFS-based mount point that re-exports CephFS. I would like to avoid the latter solution...
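For the binary-only cases, one untested idea (my own sketch, not something from the thread) would be an LD_PRELOAD shim that injects the same flock call, assuming glibc and that the applications open the data files via open(2):

/* flock_shim.c - hypothetical LD_PRELOAD wrapper around open(2).
 * Takes a shared flock on files below FLOCK_SHIM_PREFIX, mirroring the
 * one-line fix described above, for binaries we cannot rebuild.
 * Build: gcc -shared -fPIC flock_shim.c -o flock_shim.so -ldl
 * Use:   LD_PRELOAD=./flock_shim.so FLOCK_SHIM_PREFIX=/ceph ./legacy-app
 * (open64/openat would need the same treatment for some binaries.)
 */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>
#include <sys/file.h>

int open(const char *path, int flags, ...)
{
    static int (*real_open)(const char *, int, ...);
    if (!real_open)
        real_open = (int (*)(const char *, int, ...))dlsym(RTLD_NEXT, "open");

    mode_t mode = 0;
    if (flags & O_CREAT) {
        va_list ap;
        va_start(ap, flags);
        mode = (mode_t)va_arg(ap, int); /* mode_t is promoted to int */
        va_end(ap);
    }

    int fd = real_open(path, flags, mode);

    /* Best effort: mark matching files as shared, ignore flock errors. */
    const char *prefix = getenv("FLOCK_SHIM_PREFIX");
    if (fd >= 0 && prefix && strncmp(path, prefix, strlen(prefix)) == 0)
        flock(fd, LOCK_SH);

    return fd;
}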

Disabling the POSIX semantics and falling back to more NFS-like semantics without guarantees would be a setback, but is probably the easier way (if it is possible at all). Most of the data accessed by these applications is read-only, so complex locking is not necessary in those cases.

Regards,
Burkhard