ceph mds unable to start with 0.85

greg@xxxxxxxxxxx (Gregory Farnum) · Thu, 18 Sep 2014 11:33:53 -0700

On Wed, Sep 17, 2014 at 9:59 PM, ??? <Derek at f-club.cn> wrote:
> dear,
>      my ceph cluster worked for about two weeks,  mds crashed every 2-3
> days,
> Now it stuck on replay , looks like replay crash and restart mds process
> again
>  what can i do for this?
>
>  1015 => # ceph -s
> cluster 07df7765-c2e7-44de-9bb3-0b13f6517b18
> health HEALTH_ERR 56 pgs inconsistent; 56 scrub errors; mds cluster is
> degraded; noscrub,nodeep-scrub flag(s) set
> monmap e1: 2 mons at
> {storage-1-213=10.1.0.213:6789/0,storage-1-214=10.1.0.214:6789/0}, election
> epoch 26, quorum 0,1 storage-1-213,storage-1-214
> mdsmap e624: 1/1/1 up {0=storage-1-214=up:replay}, 1 up:standby
> osdmap e1932: 18 osds: 18 up, 18 in
> flags noscrub,nodeep-scrub
> pgmap v732381: 500 pgs, 3 pools, 2155 GB data, 39187 kobjects
> 4479 GB used, 32292 GB / 36772 GB avail
> 444 active+clean
> 56 active+clean+inconsistent
> client io 125 MB/s rd, 31 op/s
>
> MDS log here:
>
> 014-09-18 12:36:10.684841 7f8240512700 5 mds.-1.-1 handle_mds_map epoch 620
> from mon.0
> 2014-09-18 12:36:10.684888 7f8240512700 1 mds.-1.0 handle_mds_map standby
> 2014-09-18 12:38:55.584370 7f8240512700 5 mds.-1.0 handle_mds_map epoch 621
> from mon.0
> 2014-09-18 12:38:55.584432 7f8240512700 1 mds.0.272 handle_mds_map i am now
> mds.0.272
> 2014-09-18 12:38:55.584436 7f8240512700 1 mds.0.272 handle_mds_map state
> change up:standby --> up:replay
> 2014-09-18 12:38:55.584440 7f8240512700 1 mds.0.272 replay_start
> 2014-09-18 12:38:55.584456 7f8240512700 7 mds.0.cache set_recovery_set
> 2014-09-18 12:38:55.584460 7f8240512700 1 mds.0.272 recovery set is
> 2014-09-18 12:38:55.584464 7f8240512700 1 mds.0.272 need osdmap epoch 1929,
> have 1927
> 2014-09-18 12:38:55.584467 7f8240512700 1 mds.0.272 waiting for osdmap 1929
> (which blacklists prior instance)
> 2014-09-18 12:38:55.584523 7f8240512700 5 mds.0.272 handle_mds_failure for
> myself; not doing anything
> 2014-09-18 12:38:55.585662 7f8240512700 2 mds.0.272 boot_start 0: opening
> inotable
> 2014-09-18 12:38:55.585864 7f8240512700 2 mds.0.272 boot_start 0: opening
> sessionmap
> 2014-09-18 12:38:55.586003 7f8240512700 2 mds.0.272 boot_start 0: opening
> mds log
> 2014-09-18 12:38:55.586049 7f8240512700 5 mds.0.log open discovering log
> bounds
> 2014-09-18 12:38:55.586136 7f8240512700 2 mds.0.272 boot_start 0: opening
> snap table
> 2014-09-18 12:38:55.586984 7f8240512700 5 mds.0.272 ms_handle_connect on
> 10.1.0.213:6806/6114
> 2014-09-18 12:38:55.587037 7f8240512700 5 mds.0.272 ms_handle_connect on
> 10.1.0.213:6811/6385
> 2014-09-18 12:38:55.587285 7f8240512700 5 mds.0.272 ms_handle_connect on
> 10.1.0.213:6801/6110
> 2014-09-18 12:38:55.591700 7f823ca08700 4 mds.0.log Waiting for journal 200
> to recover...
> 2014-09-18 12:38:55.593297 7f8240512700 5 mds.0.272 ms_handle_connect on
> 10.1.0.214:6806/6238
> 2014-09-18 12:38:55.600952 7f823ca08700 4 mds.0.log Journal 200 recovered.
> 2014-09-18 12:38:55.600967 7f823ca08700 4 mds.0.log Recovered journal 200 in
> format 1
> 2014-09-18 12:38:55.600973 7f823ca08700 2 mds.0.272 boot_start 1:
> loading/discovering base inodes
> 2014-09-18 12:38:55.600979 7f823ca08700 0 mds.0.cache creating system inode
> with ino:100
> 2014-09-18 12:38:55.601279 7f823ca08700 0 mds.0.cache creating system inode
> with ino:1
> 2014-09-18 12:38:55.602557 7f8240512700 5 mds.0.272 ms_handle_connect on
> 10.1.0.214:6811/6276
> 2014-09-18 12:38:55.607234 7f8240512700 2 mds.0.272 boot_start 2: replaying
> mds log
> 2014-09-18 12:38:55.675025 7f823ca08700 7 mds.0.cache adjust_subtree_auth
> -1,-2 -> -2,-2 on [dir 1 / [2,head] auth v=0 cv=0/0 state=1073741824 f() n()
> hs=0+0,ss=0+0 0x5da0000]
> 2014-09-18 12:38:55.675055 7f823ca08700 7 mds.0.cache current root is [dir 1
> / [2,head] auth v=0 cv=0/0 state=1073741824 f() n() hs=0+0,ss=0+0 |
> subtree=1 0x5da0000]
> 2014-09-18 12:38:55.675065 7f823ca08700 7 mds.0.cache adjust_subtree_auth
> -1,-2 -> -2,-2 on [dir 100 ~mds0/ [2,head] auth v=0 cv=0/0 state=1073741824
> f() n() hs=0+0,ss=0+0 0x5da03b8]
> 2014-09-18 12:38:55.675076 7f823ca08700 7 mds.0.cache current root is [dir
> 100 ~mds0/ [2,head] auth v=0 cv=0/0 state=1073741824 f() n() hs=0+0,ss=0+0 |
> subtree=1 0x5da03b8]
> 2014-09-18 12:38:55.675087 7f823ca08700 7 mds.0.cache
> adjust_bounded_subtree_auth -2,-2 -> 0,-2 on [dir 1 / [2,head] auth
> v=1076158 cv=0/0 dir_auth=-2 state=1073741824 f(v0 m2014-09-09
> 17:49:20.000000 1=0+1) n(v87567 rc2014-09-16 12:44:41.750069 b1824476527135
> 31747410=31708953+38457)/n(v87567 rc2014-09-16 12:44:38.450226
> b1824464654503 31746894=31708437+38457) hs=0+0,ss=0+0 | subtree=1 0x5da0000]
> bound_dfs []
> 2014-09-18 12:38:55.675116 7f823ca08700 7 mds.0.cache
> adjust_bounded_subtree_auth -2,-2 -> 0,-2 on [dir 1 / [2,head] auth
> v=1076158 cv=0/0 dir_auth=-2 state=1073741824 f(v0 m2014-09-09
> 17:49:20.000000 1=0+1) n(v87567 rc2014-09-16 12:44:41.750069 b1824476527135
> 31747410=31708953+38457)/n(v87567 rc2014-09-16 12:44:38.450226
> b1824464654503 31746894=31708437+38457) hs=0+0,ss=0+0 | subtree=1 0x5da0000]
> bounds
> 2014-09-18 12:38:55.675129 7f823ca08700 7 mds.0.cache current root is [dir 1
> / [2,head] auth v=1076158 cv=0/0 dir_auth=-2 state=1073741824 f(v0
> m2014-09-09 17:49:20.000000 1=0+1) n(v87567 rc2014-09-16 12:44:41.750069
> b1824476527135 31747410=31708953+38457)/n(v87567 rc2014-09-16
> 12:44:38.450226 b1824464654503 31746894=31708437+38457) hs=0+0,ss=0+0 |
> subtree=1 0x5da0000]
> 2014-09-18 12:38:55.675142 7f823ca08700 7 mds.0.cache
> adjust_bounded_subtree_auth -2,-2 -> 0,-2 on [dir 100 ~mds0/ [2,head] auth
> v=1288520 cv=0/0 dir_auth=-2 state=1073741824 f(v0 11=1+10) n(v16575
> rc2014-09-15 18:01:15.766451 b2512216 82=72+10) hs=0+0,ss=0+0 | subtree=1
> 0x5da03b8] bound_dfs []
> 2014-09-18 12:38:55.675152 7f823ca08700 7 mds.0.cache
> adjust_bounded_subtree_auth -2,-2 -> 0,-2 on [dir 100 ~mds0/ [2,head] auth
> v=1288520 cv=0/0 dir_auth=-2 state=1073741824 f(v0 11=1+10) n(v16575
> rc2014-09-15 18:01:15.766451 b2512216 82=72+10) hs=0+0,ss=0+0 | subtree=1
> 0x5da03b8] bounds
> 2014-09-18 12:38:55.675160 7f823ca08700 7 mds.0.cache current root is [dir
> 100 ~mds0/ [2,head] auth v=1288520 cv=0/0 dir_auth=-2 state=1073741824 f(v0
> 11=1+10) n(v16575 rc2014-09-15 18:01:15.766451 b2512216 82=72+10)
> hs=0+0,ss=0+0 | subtree=1 0x5da03b8]
> 2014-09-18 12:38:55.675167 7f823ca08700 7 mds.0.cache recalc_auth_bits
> (replay)
> 2014-09-18 12:38:55.675171 7f823ca08700 7 mds.0.cache show_cache
> 2014-09-18 12:38:55.675172 7f823ca08700 7 mds.0.cache unlinked [inode 100
> [...2,head] ~mds0/ auth v85546 snaprealm=0x3bf1d40 f(v0 11=1+10) n(v16575
> rc2014-09-15 18:01:15.766451 b2512216 83=72+11)/n(v0 12=1+11) (iversion
> lock) | dirfrag=1 0x3d68000]
> 2014-09-18 12:38:55.675187 7f823ca08700 7 mds.0.cache dirfrag [dir 100
> ~mds0/ [2,head] auth v=1288520 cv=0/0 dir_auth=0 state=1073741824 f(v0
> 11=1+10) n(v16575 rc2014-09-15 18:01:15.766451 b2512216 82=72+10)
> hs=0+0,ss=0+0 | subtree=1 0x5da03b8]
> 2014-09-18 12:38:55.675207 7f823ca08700 7 mds.0.cache unlinked [inode 1
> [...2,head] / auth v1 snaprealm=0x3bf21c0 f() n(v0 1=0+1) (iversion lock) |
> dirfrag=1 0x3d68938]
> 2014-09-18 12:38:55.675213 7f823ca08700 7 mds.0.cache dirfrag [dir 1 /
> [2,head] auth v=1076158 cv=0/0 dir_auth=0 state=1073741824 f(v0 m2014-09-09
> 17:49:20.000000 1=0+1) n(v87567 rc2014-09-16 12:44:41.750069 b1824476527135
> 31747410=31708953+38457)/n(v87567 rc2014-09-16 12:44:38.450226
> b1824464654503 31746894=31708437+38457) hs=0+0,ss=0+0 | subtree=1 0x5da0000]
> 2014-09-18 12:38:56.089244 7f8240512700 5 mds.0.272 ms_handle_connect on
> 10.1.0.214:6801/6078
> 2014-09-18 12:39:08.795245 7f8240512700 5 mds.0.272 handle_mds_map epoch 622
> from mon.0
> 2014-09-18 12:49:15.677419 7f823dc0c700 5 mds.0.272 is_laggy 29.870013 > 15
> since last acked beacon
> 2014-09-18 12:49:15.680801 7f823dc0c700 5 mds.0.272 tick bailing out since
> we seem laggy
> 2014-09-18 12:49:15.684033 7f8240512700 5 mds.0.272 handle_mds_map epoch 623
> from mon.0
> 2014-09-18 12:49:15.684406 7f8240512700 1 mds.-1.-1 handle_mds_map i
> (10.1.0.213:6822/17501) dne in the mdsmap, respawning myself
> 2014-09-18 12:49:15.684429 7f8240512700 1 mds.-1.-1 respawn
> 2014-09-18 12:49:15.684432 7f8240512700 1 mds.-1.-1 e: 'ceph-mds'
> 2014-09-18 12:49:15.684444 7f8240512700 1 mds.-1.-1 0: 'ceph-mds'
> 2014-09-18 12:49:15.684447 7f8240512700 1 mds.-1.-1 1: '-i'
> 2014-09-18 12:49:15.684449 7f8240512700 1 mds.-1.-1 2: 'storage-1-213'
> 2014-09-18 12:49:15.684451 7f8240512700 1 mds.-1.-1 3: '-f'2014-09-18
> 12:49:16.475451 7f32533f77a0 0 ceph version 0.85
> (a0c22842db9eaee9840136784e94e50fabe77187), process ceph-mds, pid 17501
> 2014-09-18 12:49:16.487916 7f324e4f1700 5 mds.-1.0 ms_handle_connect on
> 10.1.0.213:6789/0
> 2014-09-18 12:49:16.517705 7f324e4f1700 5 mds.-1.0 handle_mds_map epoch 623
> from mon.0
> 2014-09-18 12:49:16.900142 7f324e4f1700 5 mds.-1.-1 handle_mds_map epoch 624
> from mon.0
> 2014-09-18 12:49:16.900190 7f324e4f1700 1 mds.-1.0 handle_mds_map standby
> 2014-09-18 12:52:05.654849 7f324e4f1700 5 mds.-1.0 handle_mds_map epoch 625
> from mon.0
> 2014-09-18 12:52:05.654929 7f324e4f1700 1 mds.0.274 handle_mds_map i am now
> mds.0.274
> 2014-09-18 12:52:05.654933 7f324e4f1700 1 mds.0.274 handle_mds_map state
> change up:standby --> up:replay
> 2014-09-18 12:52:05.654937 7f324e4f1700 1 mds.0.274 replay_start
> 2014-09-18 12:52:05.654954 7f324e4f1700 7 mds.0.cache set_recovery_set
> 2014-09-18 12:52:05.654960 7f324e4f1700 1 mds.0.274 recovery set is
> 2014-09-18 12:52:05.654962 7f324e4f1700 1 mds.0.274 need osdmap epoch 1933,
> have 1931
> 2014-09-18 12:52:05.654965 7f324e4f1700 1 mds.0.274 waiting for osdmap 1933
> (which blacklists prior instance)
> 2014-09-18 12:52:05.655022 7f324e4f1700 5 mds.0.274 handle_mds_failure for
> myself; not doing anything
> 2014-09-18 12:52:05.668471 7f324e4f1700 2 mds.0.274 boot_start 0: opening
> inotable
> 2014-09-18 12:52:05.668731 7f324e4f1700 2 mds.0.274 boot_start 0: opening
> sessionmap
> 2014-09-18 12:52:05.668878 7f324e4f1700 2 mds.0.274 boot_start 0: opening
> mds log
> 2014-09-18 12:52:05.668883 7f324e4f1700 5 mds.0.log open discovering log
> bounds

You would need to turn up logging quite a bit (debug mds = 20, debug
objecter = 10, debug journaled = 20 should be good) and put it
somewhere accessible. From that log all anybody can say is that your
MDS is respawning on purpose, because the monitor didn't get a
liveness report from it for such a long time.

That said, version .85 is a dev version and we have some versions in
our current Giant stabilization branch which might be in .85 as well,
but they won't be backported. It's generally best to run named
releases unless you are participating in the development and test of
new features. :)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com