Re: Disabling POSIX locking semantics for CephFS

On Wed, May 4, 2016 at 3:39 AM, Burkhard Linke
<Burkhard.Linke@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> Hi,
>
> On 03.05.2016 18:39, Gregory Farnum wrote:
>>
>> On Tue, May 3, 2016 at 9:30 AM, Burkhard Linke
>> <Burkhard.Linke@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>>>
>>> Hi,
>>>
>>> We have a number of legacy applications that do not cope well with the
>>> POSIX locking semantics in CephFS because they lack locking support
>>> themselves (e.g. they do not use flock syscalls). We are able to fix some
>>> of these applications, but others are binary only.
>>>
>>> Is it possible to disable POSIX locking completely in CephFS (either
>>> kernel
>>> client or ceph-fuse)?
>>
>> I'm confused. CephFS supports all of these — although some versions of
>> FUSE don't; you need a new-ish kernel.
>>
>> So are you saying that
>> 1) your setup doesn't support both fcntl and flock, and
>> 2) some of your applications don't do well under that scenario?
>>
>> I don't really see how it's safe for you to just disable the
>> underlying file locking in an application which depends on it. You may
>> need to upgrade enough that all file locks are supported.
>
>
> The application in question does a binary search in a large data file (~75
> GB) stored on CephFS. It uses open and mmap without any further locking
> controls (neither fcntl nor flock). Performance was very poor with CephFS
> (Ubuntu Trusty with the 4.4 backport kernel from Xenial, and with ceph-fuse)
> compared to the same application on NFS-based storage. I haven't had the
> time to dig further into the kernel implementation yet, but I assume the
> root cause is locking of the pages accessed via the memory-mapped file.
> Adding a single flock syscall to mark the data file globally as shared
> solved the problem for us, reducing the overall runtime from nearly 2 hours
> to 5 minutes (comparable to the NFS control case). The application runs on
> our HPC cluster, so several hundred instances may access the same data file
> at once.
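>
> For illustration, the access pattern with the added lock is roughly the
> following minimal sketch (the file path is just a placeholder):
>
>     #include <fcntl.h>
>     #include <stdio.h>
>     #include <sys/file.h>
>     #include <sys/mman.h>
>     #include <sys/stat.h>
>     #include <unistd.h>
>
>     int main(void)
>     {
>         /* placeholder path; the real data file is ~75 GB */
>         int fd = open("/cephfs/data/index.bin", O_RDONLY);
>         if (fd < 0) { perror("open"); return 1; }
>
>         /* the single added call: advisory shared lock on the whole file */
>         if (flock(fd, LOCK_SH) < 0) { perror("flock"); return 1; }
>
>         struct stat st;
>         if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }
>
>         /* read-only mapping; the binary search walks this mapping */
>         void *map = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
>         if (map == MAP_FAILED) { perror("mmap"); return 1; }
>
>         /* ... binary search over the mapped data ... */
>
>         munmap(map, st.st_size);
>         flock(fd, LOCK_UN);
>         close(fd);
>         return 0;
>     }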
>
> We have other applications that were written without locking support and
> that do not perform very well with CephFS. There was a thread in February
> with a short discussion about CephFS mmap performance
> (http://article.gmane.org/gmane.comp.file-systems.ceph.user/27501). As
> pointed out in that thread, the problem is not only related to mmap itself,
> but also to the need for proper invalidation. We cannot fix this for all our
> applications due to a lack of manpower and, in some cases, a lack of source
> code. We either have to find a way to make them work with CephFS, or use a
> different setup, e.g. an extra NFS-based mount point re-exporting CephFS. I
> would like to avoid the latter solution...
>
> Disabling the POSIX semantics and falling back to more NFS-like semantics
> without guarantees would be a setback, but it is probably the easier way (if
> it is possible at all). Most data accessed by these applications is
> read-only, so complex locking is not necessary in these cases.


See http://tracker.ceph.com/issues/15502; it may be related to this issue.

Regards
Yan, Zheng

>
> Regards,
> Burkhard
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



