Hi,
On 03.05.2016 18:39, Gregory Farnum wrote:
On Tue, May 3, 2016 at 9:30 AM, Burkhard Linke
<Burkhard.Linke@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
Hi,
we have a number of legacy applications that do not cope well with the POSIX
locking semantics in CephFS due to missing locking support (e.g. flock
syscalls). We are able to fix some of these applications, but others are
binary only.
Is it possible to disable POSIX locking completely in CephFS (either kernel
client or ceph-fuse)?
I'm confused. CephFS supports all of these, although some versions of
FUSE don't; you need a new-ish kernel.
So are you saying that
1) in your setup, it doesn't support both fcntl and flock,
2) that some of your applications don't do well under that scenario?
I don't really see how it's safe for you to just disable the
underlying file locking in an application which depends on it. You may
need to upgrade enough that all file locks are supported.
The application in question does a binary search in a large data file
(~75 GB), which is stored on CephFS. It uses open and mmap without any
further locking controls (neither fcntl nor flock). The performance was
very poor with CephFS (Ubuntu Trusty 4.4 backport kernel from Xenial and
ceph-fuse) compared to the same application on NFS-based storage. I
have not had the time to dig further into the kernel implementation yet,
but I assume that the root cause is locking pages accessed via the
memory mapped file. Adding a simple flock syscall for marking the data
file globally as shared solved the problem for us, reducing the overall
runtime from nearly 2 hours to 5 minutes (and thus comparable to the NFS
control case). The application runs on our HPC cluster, so several hundred
instances may access the same data file at once.
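
For reference, the access pattern with the added lock looks roughly like
the sketch below. It is an illustration only, not the actual application:
I am assuming that "shared" maps to flock with LOCK_SH, and the record
layout (sorted, fixed-width 8-byte little-endian keys) as well as the
names open_shared_map and find_key are made up for the example.

```python
import fcntl
import mmap
import os
import struct

# Hypothetical record layout: sorted, fixed-width 8-byte unsigned keys.
RECORD = struct.Struct("<Q")


def open_shared_map(path):
    """Open read-only, take a shared flock, then mmap the whole file."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # The single added call: a shared (read) lock on the data file.
        fcntl.flock(fd, fcntl.LOCK_SH)
        mm = mmap.mmap(fd, 0, access=mmap.ACCESS_READ)
    except Exception:
        os.close(fd)
        raise
    return fd, mm


def find_key(mm, key):
    """Binary search over the mapped records; returns the index or -1."""
    count = len(mm) // RECORD.size
    lo, hi = 0, count
    while lo < hi:
        mid = (lo + hi) // 2
        (value,) = RECORD.unpack_from(mm, mid * RECORD.size)
        if value < key:
            lo = mid + 1
        else:
            hi = mid
    if lo < count and RECORD.unpack_from(mm, lo * RECORD.size)[0] == key:
        return lo
    return -1
```

The lock is released when the descriptor is closed, so the mapping and
the descriptor are kept for the lifetime of the search and closed
together afterwards (mm.close(), then os.close(fd)).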
We have other applications that were written without locking support and
that do not perform very well with CephFS. There was a thread in
February with a short discussion about CephFS mmap performance
(http://article.gmane.org/gmane.comp.file-systems.ceph.user/27501). As
pointed out in that thread, the problem is not only related to mmap
itself, but also to the need to implement a proper invalidation. We
cannot fix this for all our applications due to the lack of manpower
and the lack of source code in some cases. We either have to find a way
to make them work with CephFS, or use a different setup, e.g. an extra
NFS-based mount point with a re-export of CephFS. I would like to avoid
the latter solution...
Disabling the POSIX semantics and falling back to more NFS-like
semantics without guarantees is a setback, but probably the easier way
(if it is possible at all). Most data accessed by these applications is
read only, so complex locking is not necessary in these cases.
Regards,
Burkhard