On Wed, Oct 7, 2015 at 11:46 PM, Dzianis Kahanovich <mahatma@xxxxxxxxxxxxxx> wrote:
> John Spray wrote:
> > [...]
>
> Here is part of the log for the restarted mds at debug 7 (without standby-replay, but IMHO that doesn't matter):
>
> (PS: How [un]safe are multiple MDSes in current hammer? For now I am temporarily working with "set_max_mds 3", but shutting down one mds is still too laggy for the affected clients. Also, the 4.2.3 kernel driver is buggy and at the very least needs heavy tuning to work properly under real load...)
>
> 2015-10-06 23:38:34.420499 7f256905a780 0 ceph version 0.94.3-242-g79385a8 (79385a85beea9bccd82c99b6bda653f0224c4fcd), process ceph-mds, pid 17337
> 2015-10-06 23:38:34.590909 7f256905a780 -1 mds.-1.0 log_to_monitors {default=true}
> 2015-10-06 23:38:34.608101 7f2561355700 7 mds.-1.server handle_osd_map: full = 0 epoch = 159091
> 2015-10-06 23:38:34.608457 7f2561355700 5 mds.-1.0 handle_mds_map epoch 7241 from mon.2
> 2015-10-06 23:38:34.608558 7f2561355700 7 mds.-1.server handle_osd_map: full = 0 epoch = 159091
> 2015-10-06 23:38:34.608614 7f2561355700 5 mds.-1.-1 handle_mds_map epoch 7241 from mon.2
> 2015-10-06 23:38:34.608618 7f2561355700 5 mds.-1.-1 old map epoch 7241 <= 7241, discarding
> 2015-10-06 23:38:35.047873 7f2561355700 7 mds.-1.server handle_osd_map: full = 0 epoch = 159092
> 2015-10-06 23:38:35.390461 7f2561355700 5 mds.-1.-1 handle_mds_map epoch 7242 from mon.2
> 2015-10-06 23:38:35.390529 7f2561355700 1 mds.-1.0 handle_mds_map standby
> 2015-10-06 23:38:35.607255 7f2561355700 5 mds.-1.0 handle_mds_map epoch 7243 from mon.2
> 2015-10-06 23:38:35.607292 7f2561355700 1 mds.0.688 handle_mds_map i am now mds.0.688
> 2015-10-06 23:38:35.607310 7f2561355700 1 mds.0.688 handle_mds_map state change up:standby --> up:replay
> 2015-10-06 23:38:35.607313 7f2561355700 1 mds.0.688 replay_start
> 2015-10-06 23:38:35.607316 7f2561355700 7 mds.0.cache set_recovery_set
> 2015-10-06 23:38:35.607318 7f2561355700 1 mds.0.688 recovery set is
> 2015-10-06 23:38:35.607321 7f2561355700 2 mds.0.688 boot_start 0: opening inotable
> 2015-10-06 23:38:35.648096 7f2561355700 2 mds.0.688 boot_start 0: opening sessionmap
> 2015-10-06 23:38:35.648227 7f2561355700 2 mds.0.688 boot_start 0: opening mds log
> 2015-10-06 23:38:35.648230 7f2561355700 5 mds.0.log open discovering log bounds
> 2015-10-06 23:38:35.648281 7f2561355700 2 mds.0.688 boot_start 0: opening snap table
> 2015-10-06 23:38:35.695211 7f255bf47700 4 mds.0.log Waiting for journal 300 to recover...
> 2015-10-06 23:38:35.699398 7f255bf47700 4 mds.0.log Journal 300 recovered.
> 2015-10-06 23:38:35.699408 7f255bf47700 4 mds.0.log Recovered journal 300 in format 1
> 2015-10-06 23:38:35.699413 7f255bf47700 2 mds.0.688 boot_start 1: loading/discovering base inodes
> 2015-10-06 23:38:35.699501 7f255bf47700 0 mds.0.cache creating system inode with ino:100
> 2015-10-06 23:38:35.716269 7f255bf47700 0 mds.0.cache creating system inode with ino:1
> 2015-10-06 23:38:35.851761 7f255eb50700 2 mds.0.688 boot_start 2: replaying mds log
> 2015-10-06 23:38:36.272808 7f255bf47700 7 mds.0.cache adjust_subtree_auth -1,-2 -> -2,-2 on [dir 1 / [2,head] auth v=0 cv=0/0 state=1073741824 f() n() hs=0+0,ss=0+0 0x5cde000]
> 2015-10-06 23:38:36.272837 7f255bf47700 7 mds.0.cache current root is [dir 1 / [2,head] auth v=0 cv=0/0 state=1073741824 f() n() hs=0+0,ss=0+0 | subtree=1 0x5cde000]
> 2015-10-06 23:38:36.272849 7f255bf47700 7 mds.0.cache adjust_subtree_auth -1,-2 -> -2,-2 on [dir 100 ~mds0/ [2,head] auth v=0 cv=0/0 state=1073741824 f() n() hs=0+0,ss=0+0 0x5cde3c0]
> 2015-10-06 23:38:36.272855 7f255bf47700 7 mds.0.cache current root is [dir 100 ~mds0/ [2,head] auth v=0 cv=0/0 state=1073741824 f() n() hs=0+0,ss=0+0 | subtree=1 0x5cde3c0]
> 2015-10-06 23:38:36.282049 7f255bf47700 7 mds.0.cache adjust_bounded_subtree_auth -2,-2 -> 0,-2 on [dir 1 / [2,head] auth v=24837995 cv=0/0 dir_auth=-2 state=1073741824 f(v175 m2015-10-05 14:34:45.120292 73=45+28) n(v2043504 rc2015-10-06 23:16:58.837591 b4465915097499 1125727=964508+161219)/n(v2043504 rc2015-10-06 23:16:46.667354 b4465915097417 1125726=964507+161219) hs=0+0,ss=0+0 | subtree=1 0x5cde000] bound_dfs []
> 2015-10-06 23:38:36.282086 7f255bf47700 7 mds.0.cache adjust_bounded_subtree_auth -2,-2 -> 0,-2 on [dir 1 / [2,head] auth v=24837995 cv=0/0 dir_auth=-2 state=1073741824 f(v175 m2015-10-05 14:34:45.120292 73=45+28) n(v2043504 rc2015-10-06 23:16:58.837591 b4465915097499 1125727=964508+161219)/n(v2043504 rc2015-10-06 23:16:46.667354 b4465915097417 1125726=964507+161219) hs=0+0,ss=0+0 | subtree=1 0x5cde000] bounds
> 2015-10-06 23:38:36.282100 7f255bf47700 7 mds.0.cache current root is [dir 1 / [2,head] auth v=24837995 cv=0/0 dir_auth=-2 state=1073741824 f(v175 m2015-10-05 14:34:45.120292 73=45+28) n(v2043504 rc2015-10-06 23:16:58.837591 b4465915097499 1125727=964508+161219)/n(v2043504 rc2015-10-06 23:16:46.667354 b4465915097417 1125726=964507+161219) hs=0+0,ss=0+0 | subtree=1 0x5cde000]
> 2015-10-06 23:38:36.282147 7f255bf47700 7 mds.0.cache adjust_bounded_subtree_auth -2,-2 -> 0,-2 on [dir 100 ~mds0/ [2,head] auth v=28417783 cv=0/0 dir_auth=-2 state=1073741824 f(v0 11=1+10) n(v821531 rc2015-10-06 23:16:58.847591 b2144931 594=584+10)/n(v821531 rc2015-10-06 23:14:18.064458 b2136353 593=583+10) hs=0+0,ss=0+0 | subtree=1 0x5cde3c0] bound_dfs []
> 2015-10-06 23:38:36.282161 7f255bf47700 7 mds.0.cache adjust_bounded_subtree_auth -2,-2 -> 0,-2 on [dir 100 ~mds0/ [2,head] auth v=28417783 cv=0/0 dir_auth=-2 state=1073741824 f(v0 11=1+10) n(v821531 rc2015-10-06 23:16:58.847591 b2144931 594=584+10)/n(v821531 rc2015-10-06 23:14:18.064458 b2136353 593=583+10) hs=0+0,ss=0+0 | subtree=1 0x5cde3c0] bounds
> 2015-10-06 23:38:36.282171 7f255bf47700 7 mds.0.cache current root is [dir 100 ~mds0/ [2,head] auth v=28417783 cv=0/0 dir_auth=-2 state=1073741824 f(v0 11=1+10) n(v821531 rc2015-10-06 23:16:58.847591 b2144931 594=584+10)/n(v821531 rc2015-10-06 23:14:18.064458 b2136353 593=583+10) hs=0+0,ss=0+0 | subtree=1 0x5cde3c0]
> 2015-10-06 23:38:36.282180 7f255bf47700 7 mds.0.cache recalc_auth_bits (replay)
> 2015-10-06 23:38:36.282183 7f255bf47700 7 mds.0.cache show_cache
> 2015-10-06 23:38:36.282195 7f255bf47700 7 mds.0.cache unlinked [inode 100 [...2,head] ~mds0/ auth v3265584 snaprealm=0x38be880 f(v0 11=1+10) n(v817982 rc2015-10-06 08:04:17.159725 b1990579 489=476+13)/n(v0 11=1+10) (iversion lock) | dirfrag=1 0x3ab8000]
> 2015-10-06 23:38:36.282210 7f255bf47700 7 mds.0.cache dirfrag [dir 100 ~mds0/ [2,head] auth v=28417783 cv=0/0 dir_auth=0 state=1073741824 f(v0 11=1+10) n(v821531 rc2015-10-06 23:16:58.847591 b2144931 594=584+10)/n(v821531 rc2015-10-06 23:14:18.064458 b2136353 593=583+10) hs=0+0,ss=0+0 | subtree=1 0x5cde3c0]
> 2015-10-06 23:38:36.282226 7f255bf47700 7 mds.0.cache unlinked [inode 1 [...2,head] / auth v10215492 snaprealm=0x38beac0 f(v175 m2015-09-17 16:18:51.151569 69=41+28) n(v1901802 rc2015-09-18 03:16:52.300905 b5226530189072 205909=160799+45110)/n(v0 1=0+1) (iversion lock) | dirfrag=1 0x3ab89a0]
> 2015-10-06 23:38:36.282236 7f255bf47700 7 mds.0.cache dirfrag [dir 1 / [2,head] auth v=24837995 cv=0/0 dir_auth=0 state=1073741824 f(v175 m2015-10-05 14:34:45.120292 73=45+28) n(v2043504 rc2015-10-06 23:16:58.837591 b4465915097499 1125727=964508+161219)/n(v2043504 rc2015-10-06 23:16:46.667354 b4465915097417 1125726=964507+161219) hs=0+0,ss=0+0 | subtree=1 0x5cde000]
> 2015-10-06 23:38:36.652925 7f255bf47700 0 log_channel(cluster) log [WRN] : replayed op client.5330402:205687,205685 used ino 10000a6cc65 but session next is 10000a666f7
> 2015-10-06 23:38:36.652992 7f255bf47700 0 log_channel(cluster) log [WRN] : replayed op client.5330402:205691,205685 used ino 10000a6cc66 but session next is 10000a666f7
> 2015-10-06 23:38:36.653067 7f255bf47700 0 log_channel(cluster) log [WRN] : replayed op client.5330402:205693,205685 used ino 10000a6cc67 but session next is 10000a666f7
> 2015-10-06 23:38:36.767359 7f255bf47700 1 mds.0.688 replay_done
> 2015-10-06 23:38:36.767372 7f255bf47700 1 mds.0.688 making mds journal writeable
> 2015-10-06 23:38:36.767824 7f255bf47700 2 mds.0.688 i am alone, moving to state reconnect
> 2015-10-06 23:38:36.767827 7f255bf47700 3 mds.0.688 request_state up:reconnect
> 2015-10-06 23:38:38.153519 7f2559a30700 0 -- 10.227.227.104:6801/17337 >> 10.227.227.104:0/928408192 pipe(0x546f000 sd=29 :6801 s=0 pgs=0 cs=0 l=0 c=0x38e6880).accept peer addr is really 10.227.227.104:0/928408192 (socket is 10.227.227.104:44734/0)
> 2015-10-06 23:38:38.153764 7f255992f700 0 -- 10.227.227.104:6801/17337 >> 10.227.227.106:0/2753586590 pipe(0x5480000 sd=30 :6801 s=0 pgs=0 cs=0 l=0 c=0x38e6700).accept peer addr is really 10.227.227.106:0/2753586590 (socket is 10.227.227.106:40232/0)
> 2015-10-06 23:38:38.154183 7f2561355700 7 mds.0.server handle_client_reconnect client.5293917
> 2015-10-06 23:38:38.154392 7f2561355700 7 mds.0.server handle_client_reconnect client.5293920
> 2015-10-06 23:38:38.154676 7f2561355700 7 mds.0.server handle_client_reconnect client.5297061
> 2015-10-06 23:38:38.154927 7f2561355700 7 mds.0.server handle_client_reconnect client.5299230
> 2015-10-06 23:38:38.159228 7f2561355700 7 mds.0.server handle_client_reconnect client.5285889
> 2015-10-06 23:38:38.160108 7f2559026700 0 -- 10.227.227.104:6801/17337 >> 10.227.227.103:0/4210455448 pipe(0x54a4000 sd=34 :6801 s=0 pgs=0 cs=0 l=0 c=0x38e6b80).accept peer addr is really 10.227.227.103:0/4210455448 (socket is 10.227.227.103:49749/0)
> 2015-10-06 23:38:38.160243 7f2558f25700 0 -- 10.227.227.104:6801/17337 >> 10.227.227.101:0/3087077443 pipe(0x54a9000 sd=35 :6801 s=0 pgs=0 cs=0 l=0 c=0x38e6d00).accept peer addr is really 10.227.227.101:0/3087077443 (socket is 10.227.227.101:44073/0)
> 2015-10-06 23:38:38.160340 7f2558d23700 0 -- 10.227.227.104:6801/17337 >> 10.227.227.102:0/1177330677 pipe(0x54bb000 sd=37 :6801 s=0 pgs=0 cs=0 l=0 c=0x38e7000).accept peer addr is really 10.227.227.102:0/1177330677 (socket is 10.227.227.102:35575/0)
> 2015-10-06 23:38:38.160723 7f2561355700 7 mds.0.server handle_client_reconnect client.5294037
> 2015-10-06 23:38:38.160865 7f2561355700 7 mds.0.server handle_client_reconnect client.5303329
> 2015-10-06 23:38:38.160932 7f2561355700 7 mds.0.server handle_client_reconnect client.3093290
> 2015-10-06 23:38:38.161235 7f2561355700 7 mds.0.server handle_client_reconnect client.5300956
> 2015-10-06 23:38:38.161808 7f2561355700 7 mds.0.server handle_client_reconnect client.5306077
> 2015-10-06 23:38:38.162426 7f2561355700 7 mds.0.server handle_client_reconnect client.3102997
> 2015-10-06 23:38:38.300492 7f2561355700 5 mds.0.688 handle_mds_map epoch 7244 from mon.2
> 2015-10-06 23:38:38.300564 7f2561355700 1 mds.0.688 handle_mds_map i am now mds.0.688
> 2015-10-06 23:38:38.300571 7f2561355700 1 mds.0.688 handle_mds_map state change up:replay --> up:reconnect
> 2015-10-06 23:38:38.300574 7f2561355700 1 mds.0.688 reconnect_start
> 2015-10-06 23:38:38.300576 7f2561355700 1 mds.0.688 reopen_log
> 2015-10-06 23:38:38.330199 7f2561355700 1 mds.0.server reconnect_clients -- 16 sessions
> 2015-10-06 23:38:38.330219 7f2561355700 7 mds.0.server handle_client_reconnect client.5293917
> 2015-10-06 23:38:38.330262 7f2561355700 0 log_channel(cluster) log [DBG] : reconnect by client.5293917 10.227.227.104:0/928408192 after 0.000023
> 2015-10-06 23:38:38.330330 7f2561355700 7 mds.0.server handle_client_reconnect client.5293920
> 2015-10-06 23:38:38.330352 7f2561355700 0 log_channel(cluster) log [DBG] : reconnect by client.5293920 10.227.227.106:0/2753586590 after 0.000137
> 2015-10-06 23:38:38.330362 7f2561355700 7 mds.0.server handle_client_reconnect client.5297061
> 2015-10-06 23:38:38.330375 7f2561355700 0 log_channel(cluster) log [DBG] : reconnect by client.5297061 10.227.227.104:0/2947571808 after 0.000165
> 2015-10-06 23:38:38.330388 7f2561355700 7 mds.0.server handle_client_reconnect client.5299230
> 2015-10-06 23:38:38.330402 7f2561355700 0 log_channel(cluster) log [DBG] : reconnect by client.5299230 10.227.227.104:0/4118952172 after 0.000191
> 2015-10-06 23:38:38.330445 7f2561355700 7 mds.0.server handle_client_reconnect client.5285889
> 2015-10-06 23:38:38.330463 7f2561355700 0 log_channel(cluster) log [DBG] : reconnect by client.5285889 10.227.227.101:0/20130 after 0.000251
> 2015-10-06 23:38:38.330474 7f2561355700 7 mds.0.server handle_client_reconnect client.5294037
> 2015-10-06 23:38:38.330487 7f2561355700 0 log_channel(cluster) log [DBG] : reconnect by client.5294037 10.227.227.103:0/4210455448 after 0.000277
> 2015-10-06 23:38:38.330497 7f2561355700 7 mds.0.server handle_client_reconnect client.5303329
> 2015-10-06 23:38:38.330509 7f2561355700 0 log_channel(cluster) log [DBG] : reconnect by client.5303329 10.227.227.101:0/3087077443 after 0.000300
> 2015-10-06 23:38:38.330527 7f2561355700 7 mds.0.server handle_client_reconnect client.3093290
> 2015-10-06 23:38:38.330541 7f2561355700 0 log_channel(cluster) log [DBG] : reconnect by client.3093290 10.227.227.102:0/1177330677 after 0.000330
> 2015-10-06 23:38:38.330561 7f2561355700 7 mds.0.server handle_client_reconnect client.5300956
> 2015-10-06 23:38:38.330573 7f2561355700 0 log_channel(cluster) log [DBG] : reconnect by client.5300956 10.227.227.106:0/109306 after 0.000363
> 2015-10-06 23:38:38.330581 7f2561355700 7 mds.0.server handle_client_reconnect client.5306077
> 2015-10-06 23:38:38.330593 7f2561355700 0 log_channel(cluster) log [DBG] : reconnect by client.5306077 10.227.227.104:0/2137 after 0.000383
> 2015-10-06 23:38:38.330601 7f2561355700 7 mds.0.server handle_client_reconnect client.3102997
> 2015-10-06 23:38:38.330614 7f2561355700 0 log_channel(cluster) log [DBG] : reconnect by client.3102997 10.227.227.102:0/8221 after 0.000403
> 2015-10-06 23:38:38.330650 7f2561355700 5 mds.0.bal rebalance done
> 2015-10-06 23:38:38.330951 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:38:38.330967 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:38:38.331075 7f2561355700 7 mds.0.server handle_client_reconnect client.5330090
> 2015-10-06 23:38:38.331090 7f2561355700 0 log_channel(cluster) log [DBG] : reconnect by client.5330090 10.227.227.104:0/3762438977 after 0.000878
> 2015-10-06 23:38:38.331138 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:38:38.331148 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:38:38.331254 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:38:38.331454 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:38:38.331500 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:38:38.331602 7f2561355700 7 mds.0.server handle_client_reconnect client.5325905
> 2015-10-06 23:38:38.331616 7f2561355700 0 log_channel(cluster) log [DBG] : reconnect by client.5325905 10.227.227.103:0/17787 after 0.001405
> 2015-10-06 23:38:38.332393 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:38:38.361193 7f2561355700 3 mds.0.server queuing replayed op
> 2015-10-06 23:38:38.361286 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:38:38.361360 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:38:38.390186 7f2557e14700 0 -- 10.227.227.104:6801/17337 >> 10.227.227.200:0/2048866910 pipe(0x3b87000 sd=43 :6801 s=0 pgs=0 cs=0 l=0 c=0x38e7900).accept peer addr is really 10.227.227.200:0/2048866910 (socket is 10.227.227.200:42145/0)
> 2015-10-06 23:38:38.390688 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:38:38.390751 7f2561355700 7 mds.0.server handle_client_reconnect client.5325809
> 2015-10-06 23:38:38.390775 7f2561355700 0 log_channel(cluster) log [DBG] : reconnect by client.5325809 10.227.227.200:0/2048866910 after 0.060554
> 2015-10-06 23:38:38.396656 7f2561355700 7 mds.0.server handle_client_reconnect client.5330402
> 2015-10-06 23:38:38.396672 7f2561355700 0 log_channel(cluster) log [DBG] : reconnect by client.5330402 10.227.227.204:0/3522816523 after 0.066459
> 2015-10-06 23:38:38.629573 7f2561355700 3 mds.0.server queuing replayed op
> 2015-10-06 23:38:38.629608 7f2561355700 3 mds.0.server queuing replayed op
> 2015-10-06 23:38:38.629622 7f2561355700 3 mds.0.server queuing replayed op
> 2015-10-06 23:38:38.629639 7f2561355700 3 mds.0.server queuing replayed op
> 2015-10-06 23:38:38.629661 7f2561355700 3 mds.0.server queuing replayed op
> 2015-10-06 23:38:38.629679 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:38:38.629698 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:38:38.629753 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:38:38.629779 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:38:38.629792 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:38:38.629845 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:38:38.629887 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:38:38.629943 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:38:38.629962 7f2561355700 7 mds.0.locker handle_client_caps on 10000a68438 follows 1 op flush
> 2015-10-06 23:38:38.629976 7f2561355700 7 mds.0.locker handle_client_caps on 10000a6843a follows 1 op flush
> 2015-10-06 23:38:38.630088 7f2561355700 7 mds.0.locker handle_client_caps on 10000a6843c follows 1 op flush
> 2015-10-06 23:38:38.630104 7f2561355700 7 mds.0.locker handle_client_caps on 10000a68439 follows 1 op flush
> 2015-10-06 23:38:38.630119 7f2561355700 7 mds.0.locker handle_client_caps on 10000a6843b follows 1 op flush
> 2015-10-06 23:38:38.858273 7f2561355700 7 mds.0.locker handle_client_caps on 10000a348a2 follows 0 op update
> 2015-10-06 23:38:38.858329 7f2561355700 7 mds.0.locker handle_client_caps on 10000a348a2 follows 0 op update
> 2015-10-06 23:38:38.931186 7f2561355700 7 mds.0.locker handle_client_caps on 10000a35092 follows 0 op update
> 2015-10-06 23:38:38.958907 7f2561355700 7 mds.0.locker handle_client_caps on 10000a35092 follows 0 op update
> 2015-10-06 23:38:39.111386 7f2561355700 7 mds.0.locker handle_client_caps on 10000a350ed follows 0 op update
> 2015-10-06 23:38:39.111791 7f2561355700 7 mds.0.locker handle_client_caps on 10000a350ed follows 0 op update
> 2015-10-06 23:38:39.157380 7f2561355700 7 mds.0.locker handle_client_caps on 10000a348a2 follows 0 op update
> 2015-10-06 23:38:39.291335 7f2561355700 7 mds.0.server handle_client_reconnect client.5298246
> 2015-10-06 23:38:39.291369 7f2561355700 0 log_channel(cluster) log [DBG] : reconnect by client.5298246 10.227.227.205:0/3532402607 after 0.961140

It seems you have 16 mounts. Are you using the kernel client or the fuse client, and which versions are they?

Yan, Zheng

> 2015-10-06 23:38:39.725442 7f2561355700 7 mds.0.server reconnect_gather_finish. failed on 0 clients
> 2015-10-06 23:38:39.725454 7f2561355700 1 mds.0.688 reconnect_done
> 2015-10-06 23:38:39.725455 7f2561355700 3 mds.0.688 request_state up:rejoin
> 2015-10-06 23:38:39.860274 7f2561355700 7 mds.0.locker handle_client_caps on 10000a68440 follows 1 op update
> 2015-10-06 23:38:39.860333 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:38:39.860343 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:38:39.860364 7f2561355700 7 mds.0.locker handle_client_caps on 10000a68442 follows 1 op update
> 2015-10-06 23:38:39.860381 7f2561355700 7 mds.0.locker handle_client_caps on 10000a68441 follows 1 op update
> 2015-10-06 23:38:39.914539 7f2561355700 7 mds.0.locker handle_client_caps on 10000a474e9 follows 0 op update
> 2015-10-06 23:38:40.327674 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:38:40.726731 7f2561355700 5 mds.0.688 handle_mds_map epoch 7245 from mon.2
> 2015-10-06 23:38:40.726778 7f2561355700 1 mds.0.688 handle_mds_map i am now mds.0.688
> 2015-10-06 23:38:40.726780 7f2561355700 1 mds.0.688 handle_mds_map state change up:reconnect --> up:rejoin
> 2015-10-06 23:38:40.726783 7f2561355700 1 mds.0.688 rejoin_start
> 2015-10-06 23:38:52.802698 7f2561355700 1 mds.0.688 rejoin_joint_start
> 2015-10-06 23:38:52.802702 7f2561355700 7 mds.0.cache rejoin_send_rejoins still processing imported caps, delaying
> 2015-10-06 23:38:52.802716 7f2561355700 5 mds.0.bal rebalance done
> 2015-10-06 23:38:52.803065 7f2561355700 7 mds.0.locker handle_client_caps on 100009b0da6 follows 0 op update
> 2015-10-06 23:38:52.803091 7f2561355700 7 mds.0.locker handle_client_caps on 10000a68444 follows 1 op update
> 2015-10-06 23:38:52.803511 7f255eb50700 7 mds.0.cache.dir(1000002e7cf) already fetching; waiting
> 2015-10-06 23:38:52.803614 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:38:52.803636 7f255eb50700 7 mds.0.cache.dir(1000002e7cf) already fetching; waiting
> 2015-10-06 23:38:52.803718 7f255eb50700 7 mds.0.cache.dir(1000002e7cf) already fetching; waiting
> 2015-10-06 23:38:52.803796 7f255eb50700 7 mds.0.cache.dir(1000002e7cf) already fetching; waiting
> 2015-10-06 23:38:52.803842 7f255eb50700 7 mds.0.cache.dir(1000002e7cf) already fetching; waiting
> 2015-10-06 23:38:52.804182 7f255eb50700 7 mds.0.cache.dir(1000002e7cf) already fetching; waiting
> 2015-10-06 23:38:52.804193 7f255eb50700 7 mds.0.cache.dir(1000002e7cf) already fetching; waiting
> [...]
> 2015-10-06 23:43:40.232678 7f255d24c700 7 mds.0.688 mds has 3 queued contexts
> 2015-10-06 23:43:40.254516 7f255eb50700 7 mds.0.cache.dir(100009a7cdd) already fetching; waiting
> 2015-10-06 23:43:40.282586 7f255eb50700 7 mds.0.cache.dir(100009e3a8c) already fetching; waiting
> 2015-10-06 23:43:40.327519 7f2561355700 3 mds.0.server not active yet, waiting
> 2015-10-06 23:43:40.344125 7f255eb50700 7 mds.0.cache.dir(100009aca65) already fetching; waiting
> 2015-10-06 23:43:40.382397 7f255eb50700 7 mds.0.cache.dir(100009a8068) already fetching; waiting
> 2015-10-06 23:43:40.406319 7f255eb50700 7 mds.0.cache.dir(100009e2ba3) already fetching; waiting
> 2015-10-06 23:43:40.439557 7f255eb50700 7 mds.0.cache.dir(100009a84ee) already fetching; waiting
> 2015-10-06 23:43:40.481450 7f255eb50700 7 mds.0.cache.dir(100009c5531) already fetching; waiting
> 2015-10-06 23:43:40.522600 7f255eb50700 7 mds.0.cache.dir(100009a4a3e) already fetching; waiting
> 2015-10-06 23:43:40.537671 7f255eb50700 7 mds.0.cache.dir(100009acc63) already fetching; waiting
> 2015-10-06 23:43:40.539612 7f255eb50700 7 mds.0.cache.dir(100009dc614) already fetching; waiting
> 2015-10-06 23:43:40.610772 7f255eb50700 7 mds.0.cache.dir(100009dc614) already fetching; waiting
> 2015-10-06 23:43:40.640233 7f255eb50700 7 mds.0.cache.dir(100009f2867) already fetching; waiting
> 2015-10-06 23:43:40.647848 7f255eb50700 7 mds.0.cache.dir(100009d0739) already fetching; waiting
> 2015-10-06 23:43:40.663730 7f255eb50700 7 mds.0.cache.dir(100009a8776) already fetching; waiting
> 2015-10-06 23:43:40.723566 7f255eb50700 7 mds.0.cache.dir(100009de1cd) already fetching; waiting
> 2015-10-06 23:43:40.747002 7f255eb50700 7 mds.0.cache.dir(100009c8188) already fetching; waiting
> 2015-10-06 23:43:40.780093 7f255eb50700 7 mds.0.cache.dir(10000a404d7) already fetching; waiting
> 2015-10-06 23:43:40.823139 7f255eb50700 7 mds.0.cache.dir(100009a90de) already fetching; waiting
> 2015-10-06 23:43:40.835196 7f255eb50700 7 mds.0.cache.dir(100009c5531) already fetching; waiting
> 2015-10-06 23:43:40.841444 7f255eb50700 7 mds.0.cache.dir(100009dc19c) already fetching; waiting
> 2015-10-06 23:43:40.875112 7f255eb50700 7 mds.0.cache.dir(100009e3a8c) already fetching; waiting
> 2015-10-06 23:43:40.890399 7f255eb50700 7 mds.0.cache.dir(100009e3a8c) already fetching; waiting
> 2015-10-06 23:43:40.929537 7f255eb50700 7 mds.0.cache.dir(100009a66a4) already fetching; waiting
> 2015-10-06 23:43:40.936432 7f255eb50700 7 mds.0.cache.dir(100009c8188) already fetching; waiting
> 2015-10-06 23:43:40.975802 7f255ca4b700 -1 mds.0.688 *** got signal Terminated ***
>
> --
> WBR, Dzianis Kahanovich AKA Denis Kaganovich, http://mahatma.bspu.unibel.by/
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
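For anyone triaging a similar failover, the per-client reconnect latencies can be pulled out of the "[DBG] : reconnect by client.<id> <addr> after <seconds>" lines in a log like the one above; counting the distinct client ids is also a way to cross-check the "reconnect_clients -- 16 sessions" figure. A quick sketch (the `reconnect_stats` helper is mine, not part of any Ceph tooling, and the regex assumes the hammer-era log format shown in this thread):

```python
import re

# Matches hammer-era MDS lines of the form:
#   "... reconnect by client.<id> <addr> after <seconds>"
RECONNECT_RE = re.compile(r"reconnect by client\.(\d+) (\S+) after ([\d.]+)")

def reconnect_stats(lines):
    """Map client id -> (peer address, reconnect delay in seconds)."""
    delays = {}
    for line in lines:
        m = RECONNECT_RE.search(line)
        if m:
            delays[m.group(1)] = (m.group(2), float(m.group(3)))
    return delays

# Two sample lines copied from the log above.
sample = [
    "2015-10-06 23:38:38.330262 7f2561355700 0 log_channel(cluster) log [DBG] : "
    "reconnect by client.5293917 10.227.227.104:0/928408192 after 0.000023",
    "2015-10-06 23:38:39.291369 7f2561355700 0 log_channel(cluster) log [DBG] : "
    "reconnect by client.5298246 10.227.227.205:0/3532402607 after 0.961140",
]
stats = reconnect_stats(sample)
slowest = max(stats.items(), key=lambda kv: kv[1][1])
print(len(stats), "clients; slowest:", slowest[0], slowest[1])
```

Feeding the full log through this makes the outliers (here, client.5298246 at ~0.96 s versus sub-millisecond for most others) stand out immediately.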