On Fri, 2019-11-22 at 14:56 +0800, Xiubo Li wrote:
> On 2019/11/22 1:30, Jeff Layton wrote:
> > On Wed, 2019-11-20 at 03:29 -0500, xiubli@xxxxxxxxxx wrote:
> > > From: Xiubo Li <xiubli@xxxxxxxxxx>
> > >
> > > In case the max_mds > 1 in MDS cluster and there is no any standby
> > > MDS and all the max_mds MDSs are in up:active state, if one of the
> > > up:active MDSs is dead, the m->m_num_laggy in kclient will be 1.
> > > Then the mount will fail without considering other healthy MDSs.
> > >
> > > Only when all the MDSs in the cluster are laggy will treat the
> > > cluster as not be available.
> > >
> > > Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
> > > ---
> > >  fs/ceph/mdsmap.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/fs/ceph/mdsmap.c b/fs/ceph/mdsmap.c
> > > index 471bac335fae..8b4f93e5b468 100644
> > > --- a/fs/ceph/mdsmap.c
> > > +++ b/fs/ceph/mdsmap.c
> > > @@ -396,7 +396,7 @@ bool ceph_mdsmap_is_cluster_available(struct ceph_mdsmap *m)
> > >  		return false;
> > >  	if (m->m_damaged)
> > >  		return false;
> > > -	if (m->m_num_laggy > 0)
> > > +	if (m->m_num_laggy == m->m_num_mds)
> > >  		return false;
> > >  	for (i = 0; i < m->m_num_mds; i++) {
> > >  		if (m->m_info[i].state == CEPH_MDS_STATE_ACTIVE)
> >
> > Given that laggy servers are still expected to be "in" the cluster,
> > should we just eliminate this check altogether? It seems like we'd still
> > want to allow a mount to occur even if the cluster is lagging.
>
> For this we need one way to distinguish between mds crash and transient
> mds laggy, for now in both cases the mds will keep staying "in" the
> cluster and be in "up:active & laggy" state.

I would doubt there's any way to do that reliably, and in any case
detection of that state will always involve some delay.

ceph_mdsmap_is_cluster_available() is only called when mounting though.
We wouldn't want to choose a laggy server over one that isn't, but I
don't think we want to fail to mount just because all of the servers
appear to be laggy. We should consider such servers to be potentially
available but not preferred.

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
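
To make the "potentially available but not preferred" idea concrete,
here is a rough user-space sketch, not a patch. The types and names
(struct mdsmap, struct mds_info, mdsmap_pick_rank() and so on) are
hypothetical, simplified stand-ins and not the kernel's actual
structures. It assumes laggy state is tracked per rank, drops the laggy
check from the availability test, and uses laggy only as a tie-breaker
when choosing a rank:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's MDS map types. */
enum mds_state { MDS_STATE_OTHER = 0, MDS_STATE_ACTIVE = 1 };

struct mds_info {
	enum mds_state state;
	bool laggy;		/* assumed per-rank laggy flag */
};

struct mdsmap {
	int num_mds;
	bool damaged;
	const struct mds_info *info;
};

/*
 * The cluster counts as available as soon as any rank is active.
 * Laggy ranks are not excluded here.
 */
static bool mdsmap_is_cluster_available(const struct mdsmap *m)
{
	int i;

	assert(m != NULL);
	if (m->damaged)
		return false;
	for (i = 0; i < m->num_mds; i++) {
		if (m->info[i].state == MDS_STATE_ACTIVE)
			return true;
	}
	return false;
}

/*
 * Pick an active rank, preferring one that is not laggy. A laggy
 * active rank is only returned when nothing better exists.
 * Returns -1 if no rank is active.
 */
static int mdsmap_pick_rank(const struct mdsmap *m)
{
	int i, fallback = -1;

	assert(m != NULL);
	for (i = 0; i < m->num_mds; i++) {
		if (m->info[i].state != MDS_STATE_ACTIVE)
			continue;
		if (!m->info[i].laggy)
			return i;	/* healthy rank: take it */
		if (fallback < 0)
			fallback = i;	/* remember first laggy rank */
	}
	return fallback;
}
```

With this shape, a map where every active rank is laggy still reports
the cluster as available, and the picker simply falls back to the first
laggy rank instead of failing the mount.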