Re: Help with file system with failed mds daemon

On Tue, Aug 22, 2017 at 4:58 PM, Bryan Banister
<bbanister@xxxxxxxxxxxxxxx> wrote:
> Hi all,
>
>
>
> I’m still new to ceph and cephfs.  I'm trying out the multi-fs configuration on
> a Luminous test cluster.  I shut down the cluster to do an upgrade, and when
> I brought the cluster back up I now have a warning that one of the file
> systems has a failed mds daemon:
>
>
>
> 2017-08-21 17:00:00.000081 mon.carf-ceph-osd15 [WRN] overall HEALTH_WARN 1
> filesystem is degraded; 1 filesystem is have a failed mds daemon; 1 pools
> have many more objects per pg than average; application not enabled on 9
> pool(s)
>
>
>
> I tried restarting the mds service on the system, and the log doesn’t seem to
> indicate any problems:
>
> 2017-08-21 16:13:40.979449 7fffed8b0700  1 mds.0.20 shutdown: shutting down
> rank 0
>
> 2017-08-21 16:13:41.012167 7ffff7fde1c0  0 set uid:gid to 167:167
> (ceph:ceph)
>
> 2017-08-21 16:13:41.012180 7ffff7fde1c0  0 ceph version 12.1.4
> (a5f84b37668fc8e03165aaf5cbb380c78e4deba4) luminous (rc), process (unknown),
> pid 16656
>
> 2017-08-21 16:13:41.014105 7ffff7fde1c0  0 pidfile_write: ignore empty
> --pid-file
>
> 2017-08-21 16:13:45.541442 7ffff10b7700  1 mds.0.23 handle_mds_map i am now
> mds.0.23
>
> 2017-08-21 16:13:45.541449 7ffff10b7700  1 mds.0.23 handle_mds_map state
> change up:boot --> up:replay
>
> 2017-08-21 16:13:45.541459 7ffff10b7700  1 mds.0.23 replay_start
>
> 2017-08-21 16:13:45.541466 7ffff10b7700  1 mds.0.23  recovery set is
>
> 2017-08-21 16:13:45.541475 7ffff10b7700  1 mds.0.23  waiting for osdmap 1198
> (which blacklists prior instance)
>
> 2017-08-21 16:13:45.565779 7fffea8aa700  0 mds.0.cache creating system inode
> with ino:0x100
>
> 2017-08-21 16:13:45.565920 7fffea8aa700  0 mds.0.cache creating system inode
> with ino:0x1
>
> 2017-08-21 16:13:45.571747 7fffe98a8700  1 mds.0.23 replay_done
>
> 2017-08-21 16:13:45.571751 7fffe98a8700  1 mds.0.23 making mds journal
> writeable
>
> 2017-08-21 16:13:46.542148 7ffff10b7700  1 mds.0.23 handle_mds_map i am now
> mds.0.23
>
> 2017-08-21 16:13:46.542149 7ffff10b7700  1 mds.0.23 handle_mds_map state
> change up:replay --> up:reconnect
>
> 2017-08-21 16:13:46.542158 7ffff10b7700  1 mds.0.23 reconnect_start
>
> 2017-08-21 16:13:46.542161 7ffff10b7700  1 mds.0.23 reopen_log
>
> 2017-08-21 16:13:46.542171 7ffff10b7700  1 mds.0.23 reconnect_done
>
> 2017-08-21 16:13:47.543612 7ffff10b7700  1 mds.0.23 handle_mds_map i am now
> mds.0.23
>
> 2017-08-21 16:13:47.543616 7ffff10b7700  1 mds.0.23 handle_mds_map state
> change up:reconnect --> up:rejoin
>
> 2017-08-21 16:13:47.543623 7ffff10b7700  1 mds.0.23 rejoin_start
>
> 2017-08-21 16:13:47.543638 7ffff10b7700  1 mds.0.23 rejoin_joint_start
>
> 2017-08-21 16:13:47.543666 7ffff10b7700  1 mds.0.23 rejoin_done
>
> 2017-08-21 16:13:48.544768 7ffff10b7700  1 mds.0.23 handle_mds_map i am now
> mds.0.23
>
> 2017-08-21 16:13:48.544771 7ffff10b7700  1 mds.0.23 handle_mds_map state
> change up:rejoin --> up:active
>
> 2017-08-21 16:13:48.544779 7ffff10b7700  1 mds.0.23 recovery_done --
> successful recovery!
>
> 2017-08-21 16:13:48.544924 7ffff10b7700  1 mds.0.23 active_start
>
> 2017-08-21 16:13:48.544954 7ffff10b7700  1 mds.0.23 cluster recovered.
>
>
>
> This seems like an easy problem to fix.  Any help is greatly appreciated!

I wonder if you have two filesystems but only one MDS?  Ceph will then
think that the second filesystem "has a failed MDS" because there
isn't an MDS online to service it.
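
A quick way to confirm is to look at the fs/MDS maps (standard Ceph CLI; the
grep is only there to narrow the output):

    ceph fs status                 # each filesystem and which MDS, if any, holds each rank
    ceph mds stat                  # one-line summary of up/standby MDS daemons
    ceph fs dump | grep -i failed  # per-filesystem MDSMap, including any failed ranks

If the second filesystem has no MDS assigned and no standby is available to
take over, that would explain the "failed mds daemon" warning.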

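If so, a minimal sketch of a fix (assuming a systemd-managed deployment; the
host name below is just a placeholder):

    # deploy an additional MDS daemon on another node, e.g. with ceph-deploy
    ceph-deploy mds create <other-node>

    # or, if an MDS is already installed there but stopped, just start it
    systemctl start ceph-mds@<other-node>

Each filesystem needs at least one active MDS (plus standbys if you want
failover), so once the extra daemon registers it should pick up the vacant
rank and the warning should clear.

Side note: the "application not enabled on 9 pool(s)" item is unrelated; in
Luminous it is cleared per pool with
"ceph osd pool application enable <pool> cephfs" (or rbd/rgw as appropriate).
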
John

>
> -Bryan
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



