From ceph -w:

[root@k8s-node-01 /]# ceph -w
  cluster:
    id:     571d4bfe-2c5d-45ca-8da1-91dcaf69942c
    health: HEALTH_WARN
            1 filesystem is degraded

  services:
    mon: 3 daemons, quorum k8s-node-00,k8s-node-01,k8s-node-02
    mgr: k8s-node-01(active)
    mds: cephfs-1/1/1 up recovery-fs-1/1/1 up {[cephfs:0]=k8s-node-00=up:active,[recovery-fs:0]=k8s-node-02=up:rejoin}, 1 up:standby
    osd: 6 osds: 6 up, 6 in
    rgw: 2 daemons active

  data:
    pools:   13 pools, 408 pgs
    objects: 745.2 k objects, 1.8 TiB
    usage:   3.6 TiB used, 8.7 TiB / 12 TiB avail
    pgs:     408 active+clean

  io:
    client:   3.3 KiB/s rd, 40 KiB/s wr, 1 op/s rd, 0 op/s wr


2019-11-05 09:43:20.168699 mon.k8s-node-00 [INF] daemon mds.k8s-node-02 restarted
2019-11-05 09:43:21.248009 mon.k8s-node-00 [ERR] Health check failed: 1 filesystem is offline (MDS_ALL_DOWN)
2019-11-05 09:43:21.348459 mon.k8s-node-00 [INF] Standby daemon mds.k8s-node-02 assigned to filesystem recovery-fs as rank 0
2019-11-05 09:43:21.349477 mon.k8s-node-00 [INF] Health check cleared: MDS_ALL_DOWN (was: 1 filesystem is offline)
2019-11-05 09:43:32.129516 mon.k8s-node-00 [INF] daemon mds.k8s-node-02 restarted
2019-11-05 09:43:32.171052 mon.k8s-node-00 [ERR] Health check failed: 1 filesystem is offline (MDS_ALL_DOWN)
2019-11-05 09:43:32.237675 mon.k8s-node-00 [INF] Standby daemon mds.k8s-node-02 assigned to filesystem recovery-fs as rank 0
2019-11-05 09:43:32.238331 mon.k8s-node-00 [INF] Health check cleared: MDS_ALL_DOWN (was: 1 filesystem is offline)
2019-11-05 09:43:46.825739 mon.k8s-node-00 [INF] daemon mds.k8s-node-02 restarted
2019-11-05 09:43:47.780044 mon.k8s-node-00 [ERR] Health check failed: 1 filesystem is offline (MDS_ALL_DOWN)
2019-11-05 09:43:47.821611 mon.k8s-node-00 [INF] Standby daemon mds.k8s-node-02 assigned to filesystem recovery-fs as rank 0
2019-11-05 09:43:47.821976 mon.k8s-node-00 [INF] Health check cleared: MDS_ALL_DOWN (was: 1 filesystem is offline)
...
...

-----Original message-----
From: Karsten Nielsen <karsten@xxxxxxxxxx>
Sent: Tue 05-11-2019 10:29
Subject: mds crash loop
To: ceph-users@xxxxxxx;

> Hi,
>
> Last week I upgraded my ceph cluster from luminous to mimic 13.2.6.
> It was running fine for a while, but yesterday my mds went into a crash loop.
>
> I have 1 active and 1 standby mds for my cephfs, both of which are running the
> same crash loop.
> I am running ceph based on https://hub.docker.com/r/ceph/daemon version
> v3.2.7-stable-3.2-minic-centos-7-x86_64 with an etcd kv store.
>
> Log details are: https://paste.debian.net/1113943/
>
> Thanks for any hints
> - Karsten
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx