Re: MDS does not always failover to hot standby

Gregory Farnum <gfarnum@xxxxxxxxxx> · Tue, 11 Sep 2018 10:52:28 -0700

The monitors can't replace an MDS if they're trying to agree amongst themselves. Note the repeated monitor elections happening at the same time.
A monitor election is *often* transparent to clients, since they and the OSDs only need the monitors when something happens in the cluster. But when you collide losing an MDS and losing monitors at the same time, the clients can't do their MDS requests but the monitors can't do a fast failover of the MDS because the monitors are trying to establish a quorum.

(There's also the time sync warnings going on; those may or may not be causing issues here but certainly aren't helping anything!)
-Greg

On Fri, Sep 7, 2018 at 7:24 PM Bryan Henderson <bryanh@xxxxxxxxxxxxxxxx> wrote:
> It's mds_beacon_grace.  Set that on the monitor to control the replacement of

> laggy MDS daemons,

Sounds like William's issue is something else.  William shuts down MDS 2 and

MON 4 simultaneously.  The log shows that some time later (we don't know how

long), MON 3 detects that MDS 2 is gone ("MDS_ALL_DOWN"), but does nothing

about it until 30 seconds later, which happens to be when MDS 2 and MON 4 come

back.  At that point, MON 3 reports that the rank has been reassigned to MDS

1.

'mds_beacon_grace' determines when a monitor declares MDS_ALL_DOWN, right?

I think if things are working as designed, the log should show MON 3

reassigning the rank to MDS 1 immediately after it reports MDS 2 is gone.

>From the original post:

2018-08-25 03:30:02.936554 mon.dub-sitv-ceph-03 mon.0 10.18.53.32:6789/0 55 : cluster [ERR] Health check failed: 1 filesystem is offline (MDS_ALL_DOWN)

2018-08-25 03:30:04.235703 mon.dub-sitv-ceph-05 mon.2 10.18.186.208:6789/0 226 : cluster [INF] mon.dub-sitv-ceph-05 calling monitor election

2018-08-25 03:30:04.238672 mon.dub-sitv-ceph-03 mon.0 10.18.53.32:6789/0 56 : cluster [INF] mon.dub-sitv-ceph-03 calling monitor election

2018-08-25 03:30:09.242595 mon.dub-sitv-ceph-03 mon.0 10.18.53.32:6789/0 57 : cluster [INF] mon.dub-sitv-ceph-03 is new leader, mons dub-sitv-ceph-03,dub-sitv-ceph-05 in quorum (ranks 0,2)

2018-08-25 03:30:09.252804 mon.dub-sitv-ceph-03 mon.0 10.18.53.32:6789/0 62 : cluster [WRN] Health check failed: 1/3 mons down, quorum dub-sitv-ceph-03,dub-sitv-ceph-05 (MON_DOWN)

2018-08-25 03:30:09.258693 mon.dub-sitv-ceph-03 mon.0 10.18.53.32:6789/0 63 : cluster [WRN] overall HEALTH_WARN 2 osds down; 2 hosts (2 osds) down; 1/3 mons down, quorum dub-sitv-ceph-03,dub-sitv-ceph-05

2018-08-25 03:30:10.254162 mon.dub-sitv-ceph-03 mon.0 10.18.53.32:6789/0 64 : cluster [WRN] Health check failed: Reduced data availability: 2 pgs inactive, 115 pgs peering (PG_AVAILABILITY)

2018-08-25 03:30:12.429145 mon.dub-sitv-ceph-03 mon.0 10.18.53.32:6789/0 66 : cluster [WRN] Health check failed: Degraded data redundancy: 712/2504 objects degraded (28.435%), 86 pgs degraded (PG_DEGRADED)

2018-08-25 03:30:16.137408 mon.dub-sitv-ceph-03 mon.0 10.18.53.32:6789/0 67 : cluster [WRN] Health check update: Reduced data availability: 1 pg inactive, 69 pgs peering (PG_AVAILABILITY)

2018-08-25 03:30:17.193322 mon.dub-sitv-ceph-03 mon.0 10.18.53.32:6789/0 68 : cluster [INF] Health check cleared: PG_AVAILABILITY (was: Reduced data availability: 1 pg inactive, 69 pgs peering)

2018-08-25 03:30:18.432043 mon.dub-sitv-ceph-03 mon.0 10.18.53.32:6789/0 69 : cluster [WRN] Health check update: Degraded data redundancy: 1286/2572 objects degraded (50.000%), 166 pgs degraded (PG_DEGRADED)

2018-08-25 03:30:26.139491 mon.dub-sitv-ceph-03 mon.0 10.18.53.32:6789/0 71 : cluster [WRN] Health check update: Degraded data redundancy: 1292/2584 objects degraded (50.000%), 166 pgs degraded (PG_DEGRADED)

2018-08-25 03:30:31.355321 mon.dub-sitv-ceph-04 mon.1 10.18.53.155:6789/0 1 : cluster [INF] mon.dub-sitv-ceph-04 calling monitor election

2018-08-25 03:30:31.371519 mon.dub-sitv-ceph-04 mon.1 10.18.53.155:6789/0 2 : cluster [WRN] message from mon.0 was stamped 0.817433s in the future, clocks not synchronized

2018-08-25 03:30:32.175677 mon.dub-sitv-ceph-03 mon.0 10.18.53.32:6789/0 72 : cluster [INF] mon.dub-sitv-ceph-03 calling monitor election

2018-08-25 03:30:32.175864 mon.dub-sitv-ceph-05 mon.2 10.18.186.208:6789/0 227 : cluster [INF] mon.dub-sitv-ceph-05 calling monitor election

2018-08-25 03:30:32.180615 mon.dub-sitv-ceph-03 mon.0 10.18.53.32:6789/0 73 : cluster [INF] mon.dub-sitv-ceph-03 is new leader, mons dub-sitv-ceph-03,dub-sitv-ceph-04,dub-sitv-ceph-05 in quorum (ranks 0,1,2)

2018-08-25 03:30:32.189593 mon.dub-sitv-ceph-03 mon.0 10.18.53.32:6789/0 78 : cluster [INF] Health check cleared: MON_DOWN (was: 1/3 mons down, quorum dub-sitv-ceph-03,dub-sitv-ceph-05)

2018-08-25 03:30:32.190820 mon.dub-sitv-ceph-03 mon.0 10.18.53.32:6789/0 79 : cluster [WRN] mon.1 10.18.53.155:6789/0 clock skew 0.811318s > max 0.05s

2018-08-25 03:30:32.194280 mon.dub-sitv-ceph-03 mon.0 10.18.53.32:6789/0 80 : cluster [WRN] overall HEALTH_WARN 2 osds down; 2 hosts (2 osds) down; Degraded data redundancy: 1292/2584 objects degraded (50.000%), 166 pgs degraded

2018-08-25 03:30:35.076121 mon.dub-sitv-ceph-03 mon.0 10.18.53.32:6789/0 83 : cluster [INF] daemon mds.dub-sitv-ceph-02 restarted

2018-08-25 03:30:35.270222 mon.dub-sitv-ceph-03 mon.0 10.18.53.32:6789/0 85 : cluster [WRN] Health check failed: 1 filesystem is degraded (FS_DEGRADED)

2018-08-25 03:30:35.270267 mon.dub-sitv-ceph-03 mon.0 10.18.53.32:6789/0 86 : cluster [ERR] Health check failed: 1 filesystem is offline (MDS_ALL_DOWN)

2018-08-25 03:30:35.282139 mon.dub-sitv-ceph-03 mon.0 10.18.53.32:6789/0 88 : cluster [INF] Standby daemon mds.dub-sitv-ceph-01 assigned to filesystem cephfs as rank 0

2018-08-25 03:30:35.282268 mon.dub-sitv-ceph-03 mon.0 10.18.53.32:6789/0 89 : cluster [INF] Health check cleared: MDS_ALL_DOWN (was: 1 filesystem is offline)

-- 

Bryan Henderson                                   San Jose, California

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com