On Thu, Apr 26, 2018 at 4:40 PM, Scottix <scottix@xxxxxxxxx> wrote:
>> Of course -- the mons can't tell the difference!
> That is really unfortunate; it would be nice to know if the file system
> has been degraded and to what degree.

If a rank is laggy/crashed, the file system as a whole is generally
unavailable. The span between a partial outage and a full one is small and
not worth quantifying.

>> You must have standbys for high availability. This is in the docs.
> Ok, but what if your standby goes down and then an active MDS goes down?
> This could happen in the real world and is a valid error scenario.
> Also, there is a period before the standby becomes active; what happens
> in between?

The standby MDS goes through a series of states in which it recovers the
lost state and its connections with clients. Finally, it goes active.

>> It depends(tm) on how the metadata is distributed and what locks are
>> held by each MDS.
> You're saying that depending on which MDS holds a lock on a resource, it
> will block that particular POSIX operation? Can you clarify a little bit?
>
>> Standbys are not optional in any production cluster.
> Of course, in production I would hope people have standbys, but in theory
> there is no enforcement in Ceph for this other than a warning. So when
> you say "not optional", that is not exactly true: it will still run.

It's self-defeating to expect CephFS to enforce having standbys --
presumably by throwing an error or becoming unavailable -- when the
standbys exist to make the system available. There's nothing to enforce.
A warning is sufficient to tell the operator that (a) they didn't
configure any standbys, or (b) MDS daemon processes/boxes are going away
and not coming back as standbys (i.e. the pool of MDS daemons is
decreasing with each failover).

-- 
Patrick Donnelly
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
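
The standby warning and MDS-state behavior discussed in the thread can be
inspected from the command line; a minimal sketch, assuming a
Luminous-or-later cluster and an example file system named `cephfs`:

```shell
# Show each rank's state (up:active, up:replay, up:reconnect, ...) and
# the current pool of standby daemons. Requires a running cluster.
ceph fs status

# Ask the mons to raise a health warning (MDS_INSUFFICIENT_STANDBY) when
# fewer than one standby is available for this file system.
ceph fs set cephfs standby_count_wanted 1

# Surface any resulting warning, e.g. "insufficient standby MDS daemons
# available".
ceph health detail
```

With `standby_count_wanted` set, the cluster still runs without standbys,
exactly as described above -- the operator just gets the warning.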