On Mon, Aug 18, 2014 at 6:56 AM, Jasper Siero
<jasper.siero at target-holding.nl> wrote:
> Hi all,
>
> We have a small ceph cluster running version 0.80.1 with cephfs on five
> nodes.
> Last week some osd's were full and shut themselves down. To help the
> osd's start again I added some extra osd's and moved some placement group
> directories on the full osd's (which have a copy on another osd) to
> another place on the node (as mentioned in
> http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/).
> After clearing some space on the full osd's I started them again. After a
> lot of deep scrubbing and two pg inconsistencies which needed to be
> repaired, everything looked fine except the mds, which is still in the
> replay state and stays that way.
> The log below says that the mds needs osdmap epoch 1833 and has 1832.
>
> 2014-08-18 12:29:22.268248 7fa786182700 1 mds.-1.0 handle_mds_map standby
> 2014-08-18 12:29:22.273995 7fa786182700 1 mds.0.25 handle_mds_map i am now
> mds.0.25
> 2014-08-18 12:29:22.273998 7fa786182700 1 mds.0.25 handle_mds_map state
> change up:standby --> up:replay
> 2014-08-18 12:29:22.274000 7fa786182700 1 mds.0.25 replay_start
> 2014-08-18 12:29:22.274014 7fa786182700 1 mds.0.25 recovery set is
> 2014-08-18 12:29:22.274016 7fa786182700 1 mds.0.25 need osdmap epoch 1833,
> have 1832
> 2014-08-18 12:29:22.274017 7fa786182700 1 mds.0.25 waiting for osdmap 1833
> (which blacklists prior instance)
>
> # ceph status
>     cluster c78209f5-55ea-4c70-8968-2231d2b05560
>      health HEALTH_WARN mds cluster is degraded
>      monmap e3: 3 mons at
> {th1-mon001=10.1.2.21:6789/0,th1-mon002=10.1.2.22:6789/0,th1-mon003=10.1.2.23:6789/0},
> election epoch 362, quorum 0,1,2 th1-mon001,th1-mon002,th1-mon003
>      mdsmap e154: 1/1/1 up {0=th1-mon001=up:replay}, 1 up:standby
>      osdmap e1951: 12 osds: 12 up, 12 in
>       pgmap v193685: 492 pgs, 4 pools, 60297 MB data, 470 kobjects
>             124 GB used, 175 GB / 299 GB avail
>                  492 active+clean
>
> # ceph osd tree
> # id    weight  type name               up/down reweight
> -1      0.2399  root default
> -2      0.05997         host th1-osd001
> 0       0.01999                 osd.0   up      1
> 1       0.01999                 osd.1   up      1
> 2       0.01999                 osd.2   up      1
> -3      0.05997         host th1-osd002
> 3       0.01999                 osd.3   up      1
> 4       0.01999                 osd.4   up      1
> 5       0.01999                 osd.5   up      1
> -4      0.05997         host th1-mon003
> 6       0.01999                 osd.6   up      1
> 7       0.01999                 osd.7   up      1
> 8       0.01999                 osd.8   up      1
> -5      0.05997         host th1-mon002
> 9       0.01999                 osd.9   up      1
> 10      0.01999                 osd.10  up      1
> 11      0.01999                 osd.11  up      1
>
> What is the way to get the mds up and running again?
>
> I still have all the placement group directories which I moved from the
> full osds which were down to create disk space.

Try just restarting the MDS daemon. This sounds a little familiar, so I
think it's a known MDS bug which may be fixed in a later dev or point
release; in any case it's a soft-state rather than an on-disk state issue.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
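
P.S. For reference, a minimal sketch of the restart, assuming a
sysvinit-managed Firefly (0.80.x) install and the MDS name th1-mon001
taken from the mdsmap above (adjust for your init system and daemon name):

# /etc/init.d/ceph restart mds.th1-mon001

or, on an Upstart-based (Ubuntu) install:

# restart ceph-mds id=th1-mon001

Then watch "ceph mds stat" or "ceph -w" to see whether the daemon picks up
the newer osdmap and moves past up:replay.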