Hi,

On 06/23/2017 02:44 PM, Bogdan SOLGA wrote:
My experience as a user so far: CephFS was declared stable with the Jewel release. Since Jewel we have not encountered any severe problem with the MDS, and it has improved a lot compared to prior releases (we have been working with CephFS since Firefly).

Active-active setups are possible, but not recommended. You can set up additional MDS servers as standby/standby-replay servers, which become active if the current active MDS fails. There may be a delay due to MDS failure detection, but the relevant timings can be adapted to your use case (see the sketch at the end of this mail). The main problem with failover to a standby MDS is the fact that all active inodes are stat()ed during failover; with many open files and slow storage this can take a considerable amount of time. Be sure to run some tests with a real-life workload.

Performance-wise it is definitely a downgrade. Every file metadata access requires communication with the MDS to obtain the necessary capability, and it may also require the MDS to contact other clients and ask them to release capabilities. The impact depends on your use case: many clients working in different directories are less affected (especially due to the limited lock/capability contention), while many clients working in the same directory incur a significant performance penalty. This behavior is to be expected from any POSIX-compliant distributed file system. Data I/O itself does not involve the MDS, so throughput should be comparable to RBD.

Try to use the kernel CephFS implementation if possible, since it does not require kernel/user space context switches and therefore performs better than ceph-fuse (mount examples below).

Problems after a client failure (e.g. files/directories still locked by the failed client) should be temporary, since stale sessions are detected and removed by the MDS.

Regards,
Burkhard
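As a rough sketch of the standby-replay setup mentioned above (Jewel-style option names; the daemon names mds.a/mds.b and the timing values are only examples and need to be adapted to your cluster):

    [mds.a]
        # the MDS currently holding rank 0 (active)

    [mds.b]
        # follow the journal of the active MDS and take over on failure
        mds standby replay = true
        mds standby for rank = 0

    [global]
        # the beacon settings control how quickly a failed MDS is detected;
        # a lower grace period means faster failover, but also a higher
        # risk of spurious failovers under load
        mds beacon interval = 4
        mds beacon grace = 15

The beacon grace is the knob behind the failover delay mentioned above; 15 seconds is the default, so do not tune it too aggressively.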
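And for completeness, the two ways to mount CephFS (the monitor address, client name and secret file path are placeholders for your own values):

    # kernel client (usually the faster option)
    mount -t ceph 192.168.0.1:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret

    # FUSE client (easier to upgrade independently of the kernel, but slower)
    ceph-fuse -m 192.168.0.1:6789 /mnt/cephfs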