Re: CephFS vs RBD

Hi,


On 06/23/2017 02:44 PM, Bogdan SOLGA wrote:
Hello, everyone!

We are working on a project which uses RBD images (formatted with XFS) as home folders for the project's users. The access speed and the overall reliability have been pretty good, so far.

From the architectural perspective, our main focus is on providing a seamless user experience in case Ceph clients suddenly go offline. Hence, we were thinking about using CephFS instead of the RBD images, and we would like to hear about your experiences with it.

My experience as a user so far:

A few questions:
  • is CephFS considered production ready? As far as I know, it currently supports a single (active) MDS server;
It was declared stable with the Jewel release. Since Jewel we have not encountered any severe problems with the MDS; it has improved a lot compared to prior releases (we have been working with CephFS since Firefly).

Active-active setups are possible, but not recommended. You can set up additional MDS daemons as standby/standby-replay servers, which become active if the currently active MDS fails. There may be a delay due to MDS failure detection, but it can be adapted to your use case.
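A minimal ceph.conf sketch for such a standby-replay setup, assuming Jewel-era option names and a second MDS daemon called mds.b (the name is only a placeholder):

    [mds.b]
        # follow rank 0 and replay its journal so takeover is faster
        mds standby replay = true
        mds standby for rank = 0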

The main problem with failover to a standby MDS is the fact that all active inodes are stat()ed during failover; with many open files and slow storage this can take a considerable amount of time. Be sure to run some tests with a real-life workload.
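You can rehearse such a failover on a test cluster with the standard CLI alone (rank 0 assumed, adjust to your setup):

    ceph mds stat     # note which daemon is currently active for rank 0
    ceph mds fail 0   # force rank 0 over to the standby
    ceph -s           # re-run/watch until the MDS is active again

Timing a client workload that runs in parallel gives you a realistic worst-case number for the interruption.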
  • should we expect any speed/performance differences between RBD and CephFS? If yes, should we see an improvement or a downgrade?
Definitely a downgrade. Every file metadata access requires communication with the MDS to acquire the necessary capabilities. It might also require the MDS to contact other clients and ask for capabilities to be released.

The impact depends on your use case: many clients working in different directories might be less affected (especially due to limited lock/capability contention), while all clients working in the same directory incurs a significant performance penalty. But this behavior is to be expected from any POSIX-compliant distributed file system.
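A crude way to see that metadata overhead is a small-file loop, run once on the CephFS mount and once on the XFS-on-RBD mount (paths and counts below are only illustrative):

    cd /mnt/cephfs/bench
    time bash -c 'for i in $(seq 1 10000); do touch f$i; done'
    time rm -f f*

The larger the share of metadata operations in your real workload, the bigger the gap you should expect.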

Data I/O itself does not involve the MDS, so the speed should be comparable to RBD. Try to use the kernel CephFS client if possible, since it avoids the kernel/user-space context switches of ceph-fuse and therefore performs better.
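For reference, the two ways to mount; monitor address, user name and secret file below are placeholders for your environment:

    # kernel client
    mount -t ceph 192.168.0.1:6789:/ /mnt/cephfs \
        -o name=admin,secretfile=/etc/ceph/admin.secret

    # FUSE client
    ceph-fuse --id admin /mnt/cephfs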
  • as far as I know, if we'd use CephFS, we'd be able to mount the file system on several Ceph clients; would there be any problem (from Ceph's perspective) if one of those clients suddenly went offline?
Problems (e.g. files/directories still locked by the failed client) should be temporary, since stale sessions are detected and removed by the MDS.
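If a session does get stuck, you can inspect it on the active MDS via the admin socket (daemon name is a placeholder; the evict syntax may differ between releases):

    ceph daemon mds.a session ls            # list client sessions and their caps
    ceph daemon mds.a session evict <id>    # force-remove a dead client's session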

Regards,
Burkhard
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
