Re: stale status from monitor?

On Tue, May 8, 2018 at 9:50 PM, Bryan Henderson <bryanh@xxxxxxxxxxxxxxxx> wrote:
> My cluster got stuck somehow, and at one point in trying to recycle things to
> unstick it, I ended up shutting down everything, then bringing up just the
> monitors.  At that point, the cluster reported the status below.
>
> With nothing but the monitors running, I don't see how the status can say
> there are two OSDs and an MDS up and requests are blocked.  This was the
> status of the cluster when I previously shut down the monitors (which I
> probably shouldn't have done when there were still OSDs and MDSs up, but I
> did).

The mon learns about down OSDs from reports sent by their OSD peers.
That means the last OSD or two to go down have nobody left to report
their absence, so they still appear up in the map.  I thought we
recently added a special case for that, but perhaps not.
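
Roughly, the idea is something like this (a simplified Python sketch,
not the actual OSDMonitor code; the class and method names are
invented for illustration):

    class MonSketch:
        """Monitor that only marks an OSD down after a live peer reports it."""

        def __init__(self, osd_ids):
            self.up = set(osd_ids)   # what the mon currently believes is up

        def receive_failure_report(self, reporter, target):
            # A failure report only counts if it comes from an OSD the
            # mon still considers up.
            if reporter in self.up and target in self.up:
                self.up.discard(target)
                print("osd.%d reported down by osd.%d" % (target, reporter))

    mon = MonSketch([0, 1])
    mon.receive_failure_report(reporter=1, target=0)   # osd.0 marked down
    # osd.1 then stops too, but no surviving OSD is left to report it,
    # so the mon's map keeps showing it as up.
    print(sorted(mon.up))    # -> [1]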

With the MDS, the monitor can't immediately distinguish a daemon that
is really gone from one that is just slow to respond.  If a standby
MDS is available, the monitor will eventually give up on the laggy MDS
and have the standby take over.  However, if no standby is available,
there is nothing useful the monitor can do about a down MDS, so it
takes the optimistic view that the MDS is still running and will check
in again eventually.
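
In rough pseudo-Python, the decision comes down to something like this
(again just an illustrative sketch, not the real MDSMonitor logic; the
names below are made up):

    def handle_laggy_mds(mds_is_laggy, standby_available):
        """What the mon can usefully do when the active MDS stops responding."""
        if not mds_is_laggy:
            return "healthy, no action"
        if standby_available:
            # With a replacement on hand, the mon eventually fails the
            # laggy daemon and promotes the standby.
            return "fail laggy MDS, promote standby"
        # With no replacement, failing the rank gains nothing, so the map
        # keeps the rank assigned and just flags it "laggy or crashed".
        return "keep rank assigned, flag as 'laggy or crashed'"

    print(handle_laggy_mds(mds_is_laggy=True, standby_available=False))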

All that is the "how", not necessarily a claim that the resulting
health report is ideal!

John


> It stayed that way for about 20 minutes, and I finally brought up the OSDs and
> everything went back to normal.
>
> So my question is:  Is this normal and what has to happen for the status to be
> current?
>
>     cluster 23352cdb-18fc-4efc-9d54-e72c000abfdb
>      health HEALTH_WARN
>             60 pgs peering
>             60 pgs stuck inactive
>             60 pgs stuck unclean
>             4 requests are blocked > 32 sec
>             mds cluster is degraded
>             mds a is laggy
>      monmap e3: 3 mons at {a=192.168.1.16:6789/0,b=192.168.1.23:6789/0,c=192.168.1.20:6789/0}
>             election epoch 202, quorum 0,1,2 a,c,b
>      mdsmap e315: 1/1/1 up {0=a=up:replay(laggy or crashed)}
>      osdmap e495: 2 osds: 2 up, 2 in
>      pgmap v33881: 160 pgs, 4 pools, 568 MB data, 14851 objects
>            1430 MB used, 43704 MB / 45134 MB avail
>                 100 active+clean
>                  60 peering
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


