Eugen;

All services are running, yes, though they didn't all start when I brought the host up (they were configured not to start, because the last thing I had done was physically relocate the entire cluster). All services are running, and happy.

# ceph status
  cluster:
    id:     1a8a1693-fa54-4cb3-89d2-7951d4cee6a3
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum S700028,S700029,S700030 (age 20h)
    mgr: S700028(active, since 17h), standbys: S700029, S700030
    mds: cifs:1 {0=S700029=up:active} 2 up:standby
    osd: 6 osds: 6 up (since 21h), 6 in (since 21h)

  data:
    pools:   16 pools, 192 pgs
    objects: 449 objects, 761 MiB
    usage:   724 GiB used, 65 TiB / 66 TiB avail
    pgs:     192 active+clean

# ceph osd tree
ID CLASS WEIGHT   TYPE NAME        STATUS REWEIGHT PRI-AFF
-1       66.17697 root default
-5       22.05899     host S700029
 2   hdd 11.02950         osd.2        up  1.00000 1.00000
 3   hdd 11.02950         osd.3        up  1.00000 1.00000
-7       22.05899     host S700030
 4   hdd 11.02950         osd.4        up  1.00000 1.00000
 5   hdd 11.02950         osd.5        up  1.00000 1.00000
-3       22.05899     host s700028
 0   hdd 11.02950         osd.0        up  1.00000 1.00000
 1   hdd 11.02950         osd.1        up  1.00000 1.00000

The question about configuring the MDS for failover struck me as a potential issue, since I don't remember doing that; however, it looks like S700029 (10.0.200.111) took over from S700028 (10.0.200.110) as the active MDS.

Thank you,

Dominic L. Hilsbos, MBA
Director - Information Technology
Perform Air International Inc.
DHilsbos@xxxxxxxxxxxxxx
www.PerformAir.com

-----Original Message-----
From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Eugen Block
Sent: Thursday, June 27, 2019 8:23 AM
To: ceph-users@xxxxxxxxxxxxxx
Subject: Re: MGR Logs after Failure Testing

Hi,

some more information about the cluster status would be helpful, such as

ceph -s
ceph osd tree

and the service status of all MONs, MDSs, and MGRs. Are all services up?
Did you configure the spare MDS as standby for rank 0 so that a failover can happen?
Regards,
Eugen

Zitat von DHilsbos@xxxxxxxxxxxxxx:

> All;
>
> I built a demonstration and testing cluster, just 3 hosts
> (10.0.200.110, 111, 112). Each host runs mon, mgr, osd, mds.
>
> During the demonstration yesterday, I pulled the power on one of the hosts.
>
> After bringing the host back up, I'm getting several error messages
> every second or so:
> 2019-06-26 16:01:56.424 7fcbe0af9700  0 ms_deliver_dispatch:
> unhandled message 0x55e80a728f00 mgrreport(mds.S700030 +0-0 packed
> 6) v7 from mds.? v2:10.0.200.112:6808/980053124
> 2019-06-26 16:01:56.425 7fcbf4cd1700  1 mgr finish mon failed to
> return metadata for mds.S700030: (2) No such file or directory
> 2019-06-26 16:01:56.429 7fcbe0af9700  0 ms_deliver_dispatch:
> unhandled message 0x55e809f8e600 mgrreport(mds.S700029 +110-0 packed
> 1366) v7 from mds.0 v2:10.0.200.111:6808/2726495738
> 2019-06-26 16:01:56.430 7fcbf4cd1700  1 mgr finish mon failed to
> return metadata for mds.S700029: (2) No such file or directory
>
> Thoughts?
>
> Thank you,
>
> Dominic L. Hilsbos, MBA
> Director - Information Technology
> Perform Air International Inc.
> DHilsbos@xxxxxxxxxxxxxx
> www.PerformAir.com
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
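For readers following this thread: the standby/failover behavior Eugen asks about can be inspected and tuned from the CLI. A minimal sketch, assuming a Nautilus-era cluster (where any running standby MDS will automatically take over a failed rank) and the filesystem name "cifs" taken from the `ceph status` output above:

```shell
# Show the active MDS and which daemons are standing by for the filesystem
ceph fs status cifs

# Dump the full FSMap, including standby daemons and their states
ceph fs dump

# Optionally let one standby follow the active MDS's journal
# (standby-replay) for a faster failover
ceph fs set cifs allow_standby_replay true
```

With plain standbys (no standby-replay), failover still happens automatically; the replacement MDS just has to replay the journal first, so takeover is somewhat slower.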