Re: cephfs ha mount expectations

I run essentially this setup, just with a few more servers. I have no outage
windows for my Ceph deployments, as they support several production environments.

The MDS is where you should focus: there are many knobs, but the MDS is the
key to the client experience. In my environment, MDS failover takes 30-180
seconds, depending on how much replay and rejoin has to take place. During
the failover, client I/O is paused but not broken. If you run an ls at the
moment of failover, it may not return for a couple of minutes in the worst
case; if a file transfer is in progress, it will stop writing for the
duration of the failover, but both will complete once failover is done.
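
As William mentions further down the thread, standby-replay helps keep that
replay window short. A minimal sketch of turning it on, assuming your
filesystem is named "cephfs" (adjust to your own name) and you have spare
MDS daemons available as standbys:

    # Let a standby MDS continuously follow the active MDS's journal,
    # so most of the replay work is already done when a failover happens
    ceph fs set cephfs allow_standby_replay true

    # Watch MDS states (active / standby-replay / replay / rejoin) while testing a failover
    ceph fs status cephfs

With a warm standby-replay daemon the replay phase is usually much shorter,
though rejoin and client reconnect still take some time.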

If I have MDS issues and the failover for whatever reason takes longer than
about 5 minutes, my clients are lost: I have to reboot every client tied to
that MDS to recover, because thousands of open files are left in various
states. That is obviously a major impact, but as we have learned Ceph it has
happened less and less often; it occurred only three times in the first year
of operation.
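
For what it's worth, before falling back to client reboots it can be worth
looking at the stuck sessions from the MDS side; a hedged sketch (rank 0 and
the session id 4305 are placeholders for your active MDS and a hung client):

    # List the client sessions the active MDS (rank 0) is tracking
    ceph tell mds.0 session ls

    # Evict one hung session by id; that client will have to remount afterwards
    ceph tell mds.0 client evict id=4305

Whether eviction is enough, or a full client reboot is still needed, depends
on how the kernel client copes with being evicted.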

It’s awesome tech, and I look forward to future enhancements in general.

On Wed, Oct 26, 2022 at 3:41 AM William Edwards <wedwards@xxxxxxxxxxxxxx>
wrote:

>
> > On 26 Oct 2022, at 10:11, mj <lists@xxxxxxxxxxxxx> wrote:
> >
> > Hi!
> >
> > We have read https://docs.ceph.com/en/latest/man/8/mount.ceph, and
> would like to see our expectations confirmed (or denied) here. :-)
> >
> > Suppose we build a three-node cluster, three monitors, three MDSs, etc,
> in order to export a cephfs to multiple client nodes.
> >
> > On the (RHEL8) clients (web application servers) fstab, we will mount
> the cephfs like:
> >
> >> ceph1,ceph2,ceph3:/ /mnt/ha-pool/ ceph name=admin,secretfile=/etc/ceph/admin.secret,noatime 0 2
> >
> > We expect that the RHEL clients will then be able to use (read/write) a
> shared /mnt/ha-pool directory simultaneously.
> >
> > Our question: how HA can we expect this setup to be? Looking for some
> practical experience here.
> >
> > Specific: Can we reboot any of the three involved ceph servers without
> the clients noticing anything? Or will there be certain timeouts involved,
> during which /mnt/ha-pool/ will appear unresponsive, and *after* a timeout
> the client switches monitor node, and /mnt/ha-pool/ will respond again?
>
> Monitor failovers don’t cause a noticeable disruption IIRC.
>
> MDS failovers do. The MDS needs to replay. You can minimise the effect
> with mds_standby_replay.
>
> >
> > Of course we hope the answer is: in such a setup, cephfs clients should
> not notice a reboot at all. :-)
> >
> > All the best!
> >
> > MJ
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



