Re: cephfs ha mount expectations

Hi all,

Thanks for the interesting discussion. It's a bit disappointing to see that CephFS, even with multiple MDS servers, is not as HA as we would like.

I also read that failover time depends on the number of clients. We will only have three, and they will not do heavy IO, so that should perhaps help a bit.

Is there any difference between an 'uncontrolled' ceph server reboot (accidental), and a controlled reboot where we (for example) first fail over the MDS in a controlled, gentle way?
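For the controlled case, we were thinking of something along these lines; an untested sketch, assuming a non-cephadm deployment and a filesystem named 'cephfs' with the active MDS at rank 0:

  # see which MDS is active and which are standby
  ceph fs status

  # hand rank 0 over to a standby before touching the host
  ceph mds fail cephfs:0

  # once a standby has taken over, reboot the node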

MJ

On 26-10-2022 at 14:40, Eugen Block wrote:
Just one comment on the standby-replay setting: it really depends on the use-case, it can make things worse during failover. Just recently we had a customer where disabling standby-replay made failovers even faster and cleaner in a heavily used cluster. With standby-replay they had to manually clean things up in the mounted directory. So I would recommend to test both options.
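To test both options, toggling the per-filesystem flag should be enough on recent releases (the filesystem name 'cephfs' here is just an example):

  ceph fs set cephfs allow_standby_replay true    # enable standby-replay
  ceph fs set cephfs allow_standby_replay false   # disable it again for comparison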

Zitat von William Edwards <wedwards@xxxxxxxxxxxxxx>:

On 26 Oct 2022 at 10:11, mj <lists@xxxxxxxxxxxxx> wrote:

Hi!

We have read https://docs.ceph.com/en/latest/man/8/mount.ceph, and would like to see our expectations confirmed (or denied) here. :-)

Suppose we build a three-node cluster with three monitors, three MDSs, etc., in order to export a CephFS to multiple client nodes.

In the fstab of the (RHEL8) clients (web application servers), we will mount the CephFS like:

ceph1,ceph2,ceph3:/ /mnt/ha-pool/ ceph name=admin,secretfile=/etc/ceph/admin.secret,noatime 0 2
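For reference, the equivalent manual mount we would use to test before committing this to fstab, with the same options:

  mount -t ceph ceph1,ceph2,ceph3:/ /mnt/ha-pool -o name=admin,secretfile=/etc/ceph/admin.secret,noatime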

We expect that the RHEL clients will then be able to use (read/write) a shared /mnt/ha-pool directory simultaneously.

Our question: how HA can we expect this setup to be? Looking for some practical experience here.

Specifically: can we reboot any of the three involved ceph servers without the clients noticing anything? Or will there be certain timeouts involved, during which /mnt/ha-pool/ will appear unresponsive, and only *after* a timeout will the client switch monitor node and /mnt/ha-pool/ respond again?

Monitor failovers don’t cause a noticeable disruption IIRC.

MDS failovers do. The MDS needs to replay. You can minimise the effect with mds_standby_replay.


Of course we hope the answer is: in such a setup, cephfs clients should not notice a reboot at all. :-)

All the best!

MJ



_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



