On Tue, Aug 22, 2017 at 8:49 PM, Bryan Banister <bbanister@xxxxxxxxxxxxxxx> wrote:
> Hi John,
>
> Seems like you're right... strange that it seemed to work with only one MDS
> before I shut the cluster down. Here is the `ceph fs get` output for the
> two file systems:
>
> [root@carf-ceph-osd15 ~]# ceph fs get carf_ceph_kube01
> Filesystem 'carf_ceph_kube01' (2)
> fs_name carf_ceph_kube01
> epoch 22
> flags 8
> created 2017-08-21 12:10:57.948579
> modified 2017-08-21 12:10:57.948579
> tableserver 0
> root 0
> session_timeout 60
> session_autoclose 300
> max_file_size 1099511627776
> last_failure 0
> last_failure_osd_epoch 1218
> compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=file layout v2}
> max_mds 1
> in 0
> up {}
> failed 0
> damaged
> stopped
> data_pools [23]
> metadata_pool 24
> inline_data disabled
> balancer
> standby_count_wanted 0
> [root@carf-ceph-osd15 ~]#
> [root@carf-ceph-osd15 ~]# ceph fs get carf_ceph02
> Filesystem 'carf_ceph02' (1)
> fs_name carf_ceph02
> epoch 26
> flags 8
> created 2017-08-18 14:20:50.152054
> modified 2017-08-18 14:20:50.152054
> tableserver 0
> root 0
> session_timeout 60
> session_autoclose 300
> max_file_size 1099511627776
> last_failure 0
> last_failure_osd_epoch 1198
> compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=file layout v2}
> max_mds 1
> in 0
> up {0=474299}
> failed
> damaged
> stopped
> data_pools [21]
> metadata_pool 22
> inline_data disabled
> balancer
> standby_count_wanted 0
> 474299: 7.128.13.69:6800/304042158 'carf-ceph-osd15' mds.0.23 up:active seq 5

In that instance, it's not complaining because one of the filesystems has
never had an MDS.
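If it helps, a rough sketch of how to double-check which filesystem is
missing its MDS and get a daemon serving it. Each active rank needs its own
ceph-mds daemon, so with two filesystems you want at least two daemons
running. The hostname carf-ceph-osd16 below is only a placeholder for
whichever node you put the extra daemon on, and ceph-deploy is just one way
to provision it; use whatever tooling you normally deploy daemons with:

ceph fs dump     # carf_ceph_kube01 shows "up {}", i.e. no daemon holds its rank 0
ceph mds stat    # quick summary of which MDS daemons are up/active/standby

# Provision and start a second MDS daemon on another node (placeholder host):
ceph-deploy mds create carf-ceph-osd16
systemctl status ceph-mds@carf-ceph-osd16   # confirm the new daemon is running

ceph fs dump     # re-check: both filesystems should now report an active rank 0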
> I also looked into trying to specify the mds_namespace option to the mount
> operation (http://docs.ceph.com/docs/master/cephfs/kernel/) but that doesn't
> seem to be valid:
>
> [ceph-admin@carf-ceph-osd04 ~]$ sudo mount -t ceph carf-ceph-osd15:6789:/ /mnt/carf_ceph02/ -o mds_namespace=carf_ceph02,name=cephfs.k8test,secretfile=k8test.secret
> mount error 22 = Invalid argument

It's likely that you are using an older kernel that doesn't have support
for the feature. It was added in Linux 4.8.
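To confirm that, check the kernel version; and if upgrading the kernel isn't
convenient, ceph-fuse can select the filesystem instead. A sketch below: the
monitor address, client id and mount point are just taken from your mount
attempt, and note that ceph-fuse authenticates with a keyring rather than a
secretfile, so the client.cephfs.k8test key needs to be in a keyring it can
find (e.g. passed with --keyring):

uname -r    # the kernel client only understands mds_namespace from 4.8 onwards

# FUSE alternative: pick the filesystem with the client_mds_namespace option
ceph-fuse -m carf-ceph-osd15:6789 --id cephfs.k8test \
    --client_mds_namespace=carf_ceph02 /mnt/carf_ceph02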
John

> Thanks,
> -Bryan
>
> -----Original Message-----
> From: John Spray [mailto:jspray@xxxxxxxxxx]
> Sent: Tuesday, August 22, 2017 11:18 AM
> To: Bryan Banister <bbanister@xxxxxxxxxxxxxxx>
> Cc: ceph-users@xxxxxxxxxxxxxx
> Subject: Re: Help with file system with failed mds daemon
>
> > On Tue, Aug 22, 2017 at 4:58 PM, Bryan Banister
> > <bbanister@xxxxxxxxxxxxxxx> wrote:
> >> Hi all,
> >>
> >> I'm still new to ceph and cephfs. Trying out the multi-fs configuration
> >> on a Luminous test cluster. I shut down the cluster to do an upgrade and
> >> when I brought the cluster back up I now have a warning that one of the
> >> file systems has a failed MDS daemon:
> >>
> >> 2017-08-21 17:00:00.000081 mon.carf-ceph-osd15 [WRN] overall HEALTH_WARN 1 filesystem is degraded; 1 filesystem is have a failed mds daemon; 1 pools have many more objects per pg than average; application not enabled on 9 pool(s)
> >>
> >> I tried restarting the MDS service on the system and it doesn't seem to
> >> indicate any problems:
> >>
> >> 2017-08-21 16:13:40.979449 7fffed8b0700 1 mds.0.20 shutdown: shutting down rank 0
> >> 2017-08-21 16:13:41.012167 7ffff7fde1c0 0 set uid:gid to 167:167 (ceph:ceph)
> >> 2017-08-21 16:13:41.012180 7ffff7fde1c0 0 ceph version 12.1.4 (a5f84b37668fc8e03165aaf5cbb380c78e4deba4) luminous (rc), process (unknown), pid 16656
> >> 2017-08-21 16:13:41.014105 7ffff7fde1c0 0 pidfile_write: ignore empty --pid-file
> >> 2017-08-21 16:13:45.541442 7ffff10b7700 1 mds.0.23 handle_mds_map i am now mds.0.23
> >> 2017-08-21 16:13:45.541449 7ffff10b7700 1 mds.0.23 handle_mds_map state change up:boot --> up:replay
> >> 2017-08-21 16:13:45.541459 7ffff10b7700 1 mds.0.23 replay_start
> >> 2017-08-21 16:13:45.541466 7ffff10b7700 1 mds.0.23 recovery set is
> >> 2017-08-21 16:13:45.541475 7ffff10b7700 1 mds.0.23 waiting for osdmap 1198 (which blacklists prior instance)
> >> 2017-08-21 16:13:45.565779 7fffea8aa700 0 mds.0.cache creating system inode with ino:0x100
> >> 2017-08-21 16:13:45.565920 7fffea8aa700 0 mds.0.cache creating system inode with ino:0x1
> >> 2017-08-21 16:13:45.571747 7fffe98a8700 1 mds.0.23 replay_done
> >> 2017-08-21 16:13:45.571751 7fffe98a8700 1 mds.0.23 making mds journal writeable
> >> 2017-08-21 16:13:46.542148 7ffff10b7700 1 mds.0.23 handle_mds_map i am now mds.0.23
> >> 2017-08-21 16:13:46.542149 7ffff10b7700 1 mds.0.23 handle_mds_map state change up:replay --> up:reconnect
> >> 2017-08-21 16:13:46.542158 7ffff10b7700 1 mds.0.23 reconnect_start
> >> 2017-08-21 16:13:46.542161 7ffff10b7700 1 mds.0.23 reopen_log
> >> 2017-08-21 16:13:46.542171 7ffff10b7700 1 mds.0.23 reconnect_done
> >> 2017-08-21 16:13:47.543612 7ffff10b7700 1 mds.0.23 handle_mds_map i am now mds.0.23
> >> 2017-08-21 16:13:47.543616 7ffff10b7700 1 mds.0.23 handle_mds_map state change up:reconnect --> up:rejoin
> >> 2017-08-21 16:13:47.543623 7ffff10b7700 1 mds.0.23 rejoin_start
> >> 2017-08-21 16:13:47.543638 7ffff10b7700 1 mds.0.23 rejoin_joint_start
> >> 2017-08-21 16:13:47.543666 7ffff10b7700 1 mds.0.23 rejoin_done
> >> 2017-08-21 16:13:48.544768 7ffff10b7700 1 mds.0.23 handle_mds_map i am now mds.0.23
> >> 2017-08-21 16:13:48.544771 7ffff10b7700 1 mds.0.23 handle_mds_map state change up:rejoin --> up:active
> >> 2017-08-21 16:13:48.544779 7ffff10b7700 1 mds.0.23 recovery_done -- successful recovery!
> >> 2017-08-21 16:13:48.544924 7ffff10b7700 1 mds.0.23 active_start
> >> 2017-08-21 16:13:48.544954 7ffff10b7700 1 mds.0.23 cluster recovered.
> >>
> >> This seems like an easy problem to fix. Any help is greatly appreciated!
> >
> > I wonder if you have two filesystems but only one MDS?  Ceph will then
> > think that the second filesystem "has a failed MDS" because there
> > isn't an MDS online to service it.
> >
> > John
> >
> >> -Bryan

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com