Re: [PATCH 2/3] mdsmap: fix mdsmap cluster available check based on laggy number

Jeff Layton <jlayton@xxxxxxxxxx> · Fri, 22 Nov 2019 08:55:23 -0500

On Fri, 2019-11-22 at 14:56 +0800, Xiubo Li wrote:
> On 2019/11/22 1:30, Jeff Layton wrote:
> > On Wed, 2019-11-20 at 03:29 -0500, xiubli@xxxxxxxxxx wrote:
> > > From: Xiubo Li <xiubli@xxxxxxxxxx>
> > > 
> > > In case the max_mds > 1 in MDS cluster and there is no any standby
> > > MDS and all the max_mds MDSs are in up:active state, if one of the
> > > up:active MDSs is dead, the m->m_num_laggy in kclient will be 1.
> > > Then the mount will fail without considering other healthy MDSs.
> > > 
> > > Only when all the MDSs in the cluster are laggy will treat the
> > > cluster as not be available.
> > > 
> > > Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
> > > ---
> > >   fs/ceph/mdsmap.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/ceph/mdsmap.c b/fs/ceph/mdsmap.c
> > > index 471bac335fae..8b4f93e5b468 100644
> > > --- a/fs/ceph/mdsmap.c
> > > +++ b/fs/ceph/mdsmap.c
> > > @@ -396,7 +396,7 @@ bool ceph_mdsmap_is_cluster_available(struct ceph_mdsmap *m)
> > >   		return false;
> > >   	if (m->m_damaged)
> > >   		return false;
> > > -	if (m->m_num_laggy > 0)
> > > +	if (m->m_num_laggy == m->m_num_mds)
> > >   		return false;
> > >   	for (i = 0; i < m->m_num_mds; i++) {
> > >   		if (m->m_info[i].state == CEPH_MDS_STATE_ACTIVE)
> > Given that laggy servers are still expected to be "in" the cluster,
> > should we just eliminate this check altogether? It seems like we'd still
> > want to allow a mount to occur even if the cluster is lagging.
> 
> For this we need one way to distinguish between mds crash and transient 
> mds laggy, for now in both cases the mds will keep staying "in" the 
> cluster and be in "up:active & laggy" state.

I would doubt there's any way to do that reliably, and in any case
detection of that state will always involve some delay.

ceph_mdsmap_is_cluster_available() is only called when mounting though.
We wouldn't want to choose a laggy server over one that isn't, but I
don't think we want to fail to mount just because all of the servers
appear to be laggy. We should consider such servers to be potentially
available but not preferred.
-- 
Jeff Layton <jlayton@xxxxxxxxxx>