Awesome, glad a simple upgrade fixed it for you. :)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Mon, Sep 16, 2013 at 6:18 AM, Serge Slipchenko <serge.slipchenko@xxxxxxxxx> wrote:
> Hi,
>
> Searching the web, I found similar symptoms:
> http://tracker.ceph.com/issues/6087
> I found that my ceph-mds had not been updated and was still at 0.67.2, which
> doesn't have the MDS patch.
> After updating to 0.67.3, the MDS stabilized.
>
> I am terribly sorry, but I hope that my bad experience will help someone.
>
> On Mon, Sep 16, 2013 at 11:25 AM, Serge Slipchenko
> <serge.slipchenko@xxxxxxxxx> wrote:
>>
>> Hi Gregory,
>>
>> On Sun, Sep 15, 2013 at 10:59 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>>
>>> What's the output of "ceph -s", and have you tried running the MDS
>>> with any logging enabled that we can check out?
>>
>>
>> See the logs below, captured with
>> sudo ceph mds tell 0 injectargs '--debug_ms 20 --debug_mds 20' and
>> sudo ceph mds tell 0 injectargs '--debug_ms 1 --debug_mds 1'
>>
>> sudo ceph -s
>> cluster 920ff156-998f-44a9-a0c6-5bc265d4ac2e
>> health HEALTH_WARN mds cluster is degraded
>> monmap e7: 3 mons at
>> {s01=144.76.13.102:6789/0,s02=144.76.13.103:6789/0,s03=144.76.13.105:6789/0},
>> election epoch 4680, quorum 0,1,2 s01,s02,s03
>> osdmap e7278: 16 osds: 16 up, 16 in
>> pgmap v1955548: 704 pgs: 704 active+clean; 207 GB data, 426 GB used,
>> 38463 GB / 40971 GB avail; 338KB/s rd, 338op/s
>> mdsmap e1307: 1/1/1 up {0=m02=up:rejoin}, 1 up:standby
>>
>> sudo ceph mds tell 0 injectargs '--debug_ms 20 --debug_mds 20'
>>
>> 2013-09-16 10:15:36.724250 7f455864d700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).writer sleeping
>> 2013-09-16 10:15:36.724257 7f455d066700 10 mds.0.cache
>> _open_ino_backtrace_fetched ino 10000003e4d errno 0
>> 2013-09-16 10:15:36.724264 7f455d066700 10 mds.0.cache old object in pool
>> 1, retrying pool -1
>> 2013-09-16 10:15:36.724289 7f455d066700 1 -- 5.9.122.115:6806/29741 -->
>> 144.76.13.103:6789/0 -- mon_get_version(what=osdmap handle=20931738) v1 --
>> ?+0 0x30e4540 con 0x1875c60
>> 2013-09-16 10:15:36.724296 7f455d066700 20 -- 5.9.122.115:6806/29741
>> submit_message mon_get_version(what=osdmap handle=20931738) v1 remote,
>> 144.76.13.103:6789/0, have pipe.
>> 2013-09-16 10:15:36.724313 7f455d066700 10 -- 5.9.122.115:6806/29741
>> dispatch_throttle_release 156 to dispatch throttler 156/104857600
>> 2013-09-16 10:15:36.724322 7f455d066700 20 -- 5.9.122.115:6806/29741 done
>> calling dispatch on 0x1898000
>> 2013-09-16 10:15:36.724318 7f4559e5e700 10 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).writer: state = open policy.server=0
>> 2013-09-16 10:15:36.724337 7f4559e5e700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).writer encoding 20966720 features 34359738367 0x30e4540
>> mon_get_version(what=osdmap handle=20931738) v1
>> 2013-09-16 10:15:36.724356 7f4559e5e700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).writer no session security
>> 2013-09-16 10:15:36.724365 7f4559e5e700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).writer sending 20966720 0x30e4540
>> 2013-09-16 10:15:36.724388 7f4559e5e700 10 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).writer: state = open policy.server=0
>> 2013-09-16 10:15:36.724396 7f4559e5e700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).writer sleeping
>> 2013-09-16 10:15:36.725105 7f455af61700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).reader got ACK
>> 2013-09-16 10:15:36.725124 7f455af61700 15 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).reader got ack seq 20966720
>> 2013-09-16 10:15:36.725133 7f455af61700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).reader reading tag...
>> 2013-09-16 10:15:36.725143 7f455af61700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).reader got MSG
>> 2013-09-16 10:15:36.725152 7f455af61700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).reader got envelope type=20 src mon.1 front=24 data=0 off 0
>> 2013-09-16 10:15:36.725162 7f455af61700 10 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).reader wants 24 from dispatch throttler 0/104857600
>> 2013-09-16 10:15:36.725172 7f455af61700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).reader got front 24
>> 2013-09-16 10:15:36.725180 7f455af61700 10 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).aborted = 0
>> 2013-09-16 10:15:36.725187 7f455af61700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).reader got 24 + 0 + 0 byte message
>> 2013-09-16 10:15:36.725195 7f455af61700 10 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).No session security set
>> 2013-09-16 10:15:36.725204 7f455af61700 10 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).reader got message 20966742 0x30e4e00
>> mon_check_map_ack(handle=20931738 version=7278) v2
>> 2013-09-16 10:15:36.725212 7f455af61700 20 -- 5.9.122.115:6806/29741 queue
>> 0x30e4e00 prio 196
>> 2013-09-16 10:15:36.725218 7f455af61700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).reader reading tag...
>> 2013-09-16 10:15:36.725236 7f4559e5e700 10 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).writer: state = open policy.server=0
>> 2013-09-16 10:15:36.725241 7f455d066700 1 -- 5.9.122.115:6806/29741 <==
>> mon.1 144.76.13.103:6789/0 20966742 ==== mon_check_map_ack(handle=20931738
>> version=7278) v2 ==== 24+0+0 (2168433882 0 0) 0x30e4e00 con 0x1875c60
>> 2013-09-16 10:15:36.725263 7f4559e5e700 10 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).write_ack 20966742
>> 2013-09-16 10:15:36.725275 7f455d066700 10 -- 5.9.122.115:6806/29741
>> dispatch_throttle_release 24 to dispatch throttler 24/104857600
>> 2013-09-16 10:15:36.725286 7f455d066700 20 -- 5.9.122.115:6806/29741 done
>> calling dispatch on 0x30e4e00
>> 2013-09-16 10:15:36.725285 7f4559e5e700 10 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).writer: state = open policy.server=0
>> 2013-09-16 10:15:36.725296 7f455b863700 10 mds.0.cache
>> _open_ino_backtrace_fetched ino 10000003e4d errno -2
>> 2013-09-16 10:15:36.725306 7f455b863700 10 mds.0.cache no object in pool
>> -1, retrying pool 1
>> 2013-09-16 10:15:36.725298 7f4559e5e700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).writer sleeping
>> 2013-09-16 10:15:36.725326 7f455b863700 1 -- 5.9.122.115:6806/29741 -->
>> 5.9.143.75:6811/25411 -- osd_op(mds.0.212:41863610 10000003e4d.00000000
>> [getxattr parent] 1.82468d56 e7278) v4 -- ?+0 0x1860480 con 0x1939b00
>> 2013-09-16 10:15:36.725341 7f455b863700 20 -- 5.9.122.115:6806/29741
>> submit_message osd_op(mds.0.212:41863610 10000003e4d.00000000 [getxattr
>> parent] 1.82468d56 e7278) v4 remote, 5.9.143.75:6811/25411, have pipe.
>> 2013-09-16 10:15:36.725387 7f455864d700 10 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).writer: state = open policy.server=0
>> 2013-09-16 10:15:36.725409 7f455864d700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).writer encoding 20931757 features 34359738367 0x1860480
>> osd_op(mds.0.212:41863610 10000003e4d.00000000 [getxattr parent] 1.82468d56
>> e7278) v4
>> 2013-09-16 10:15:36.725468 7f455864d700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).writer signed seq # 20931757): sig = 10143984293515600654
>> 2013-09-16 10:15:36.725487 7f455864d700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).writer sending 20931757 0x1860480
>> 2013-09-16 10:15:36.725509 7f455864d700 10 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).writer: state = open policy.server=0
>> 2013-09-16 10:15:36.725518 7f455864d700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).writer sleeping
>> 2013-09-16 10:15:36.727034 7f455834a700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader got ACK
>> 2013-09-16 10:15:36.727055 7f455834a700 15 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader got ack seq 20931757
>> 2013-09-16 10:15:36.727064 7f455834a700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader reading tag...
>> 2013-09-16 10:15:36.727074 7f455834a700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader got MSG
>> 2013-09-16 10:15:36.727085 7f455834a700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader got envelope type=43 src osd.9 front=119 data=37 off 0
>> 2013-09-16 10:15:36.727095 7f455834a700 10 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader wants 156 from dispatch throttler 0/104857600
>> 2013-09-16 10:15:36.727105 7f455834a700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader got front 119
>> 2013-09-16 10:15:36.727113 7f455834a700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader allocating new rx buffer at offset 0
>> 2013-09-16 10:15:36.727119 7f455834a700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader reading nonblocking into 0x5dc5690 len 37
>> 2013-09-16 10:15:36.727129 7f455834a700 10 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).aborted = 0
>> 2013-09-16 10:15:36.727135 7f455834a700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader got 119 + 0 + 37 byte message
>> 2013-09-16 10:15:36.727180 7f455834a700 10 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader got message 20931757 0x1898e00 osd_op_reply(41863610
>> 10000003e4d.00000000 [getxattr (37)] ondisk = 0) v4
>> 2013-09-16 10:15:36.727192 7f455834a700 20 -- 5.9.122.115:6806/29741 queue
>> 0x1898e00 prio 127
>> 2013-09-16 10:15:36.727201 7f455834a700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader reading tag...
>> 2013-09-16 10:15:36.727214 7f455d066700 1 -- 5.9.122.115:6806/29741 <==
>> osd.9 5.9.143.75:6811/25411 20931757 ==== osd_op_reply(41863610
>> 10000003e4d.00000000 [getxattr (37)] ondisk = 0) v4 ==== 119+0+37
>> (2687073223 0 917274488) 0x1898e00 con 0x1939b00
>> 2013-09-16 10:15:36.727213 7f455864d700 10 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).writer: state = open policy.server=0
>> 2013-09-16 10:15:36.727241 7f455d066700 10 mds.0.cache
>> _open_ino_backtrace_fetched ino 10000003e4d errno 0
>> 2013-09-16 10:15:36.727236 7f455864d700 10 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).write_ack 20931757
>> 2013-09-16 10:15:36.727246 7f455d066700 10 mds.0.cache old object in pool
>> 1, retrying pool -1
>> 2013-09-16 10:15:36.727253 7f455864d700 10 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).writer: state = open policy.server=0
>> 2013-09-16 10:15:36.727262 7f455864d700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).writer sleeping
>> 2013-09-16 10:15:36.727267 7f455d066700 1 -- 5.9.122.115:6806/29741 -->
>> 144.76.13.103:6789/0 -- mon_get_version(what=osdmap handle=20931739) v1 --
>> ?+0 0x30e4e00 con 0x1875c60
>> 2013-09-16 10:15:36.727274 7f455d066700 20 -- 5.9.122.115:6806/29741
>> submit_message mon_get_version(what=osdmap handle=20931739) v1 remote,
>> 144.76.13.103:6789/0, have pipe.
>> 2013-09-16 10:15:36.727287 7f455d066700 10 -- 5.9.122.115:6806/29741
>> dispatch_throttle_release 156 to dispatch throttler 156/104857600
>> 2013-09-16 10:15:36.727292 7f455d066700 20 -- 5.9.122.115:6806/29741 done
>> calling dispatch on 0x1898e00
>> 2013-09-16 10:15:36.727295 7f4559e5e700 10 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).writer: state = open policy.server=0
>> 2013-09-16 10:15:36.727322 7f4559e5e700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).writer encoding 20966721 features 34359738367 0x30e4e00
>> mon_get_version(what=osdmap handle=20931739)
>> policy.server=0^C
>>
>> sudo ceph mds tell 0 injectargs '--debug_ms 1 --debug_mds 1'
>>
>> 2013-09-16 10:17:25.179033 7f455d066700 1 -- 5.9.122.115:6806/29741 -->
>> 144.76.13.103:6789/0 -- mon_get_version(what=osdmap handle=20966628) v1 --
>> ?+0 0x30e4c40 con 0x1875c60
>> 2013-09-16 10:17:25.179940 7f455d066700 1 -- 5.9.122.115:6806/29741 <==
>> mon.1 144.76.13.103:6789/0 21001660 ==== mon_check_map_ack(handle=20966628
>> version=7278) v2 ==== 24+0+0 (3394005587 0 0) 0x30e4e00 con 0x1875c60
>> 2013-09-16 10:17:25.180059 7f455b863700 1 -- 5.9.122.115:6806/29741 -->
>> 5.9.143.75:6811/25411 -- osd_op(mds.0.212:41933390 10000003e4d.00000000
>> [getxattr parent] 1.82468d56 e7278) v4 -- ?+0 0x7a63b40 con 0x1939b00
>> 2013-09-16 10:17:25.181726 7f455d066700 1 -- 5.9.122.115:6806/29741 <==
>> osd.9 5.9.143.75:6811/25411 20966647 ==== osd_op_reply(41933390
>> 10000003e4d.00000000 [getxattr (37)] ondisk = 0) v4 ==== 119+0+37
>> (2687073223 0 917274488) 0x2fd7600 con 0x1939b00
>> 2013-09-16 10:17:25.181791 7f455d066700 1 -- 5.9.122.115:6806/29741 -->
>> 144.76.13.103:6789/0 -- mon_get_version(what=osdmap handle=20966629) v1 --
>> ?+0 0x30e4e00 con 0x1875c60
>> 2013-09-16 10:17:25.182655 7f455d066700 1 -- 5.9.122.115:6806/29741 <==
>> mon.1 144.76.13.103:6789/0 21001661 ==== mon_check_map_ack(handle=20966629
>> version=7278) v2 ==== 24+0+0 (1879288029 0 0) 0x30e4540 con 0x1875c60
>> 2013-09-16 10:17:25.182758 7f455b863700 1 -- 5.9.122.115:6806/29741 -->
>> 5.9.143.75:6811/25411 -- osd_op(mds.0.212:41933392 10000003e4d.00000000
>> [getxattr parent] 1.82468d56 e7278) v4 -- ?+0 0x7a63900 con 0x1939b00
>> 2013-09-16 10:17:25.184491 7f455d066700 1 -- 5.9.122.115:6806/29741 <==
>> osd.9 5.9.143.75:6811/25411 20966648 ==== osd_op_reply(41933392
>> 10000003e4d.00000000 [getxattr (37)] ondisk = 0) v4 ==== 119+0+37
>> (2687073223 0 917274488) 0x2fd7400 con 0x1939b00
>> 2013-09-16 10:17:25.184556 7f455d066700 1 -- 5.9.122.115:6806/29741 -->
>> 144.76.13.103:6789/0 -- mon_get_version(what=osdmap handle=20966630) v1 --
>> ?+0 0x30e4540 con 0x1875c60
>> 2013-09-16 10:17:25.185415 7f455d066700 1 -- 5.9.122.115:6806/29741 <==
>> mon.1 144.76.13.103:6789/0 21001662 ==== mon_check_map_ack(handle=20966630
>> version=7278) v2 ==== 24+0+0 (3141507518 0 0) 0x30fcc40 con 0x1875c60
>> 2013-09-16 10:17:25.185512 7f455b863700 1 -- 5.9.122.115:6806/29741 -->
>> 5.9.143.75:6811/25411 -- osd_op(mds.0.212:41933394 10000003e4d.00000000
>> [getxattr parent] 1.82468d56 e7278) v4 -- ?+0 0x7a636c0 con 0x1939b00
>> 2013-09-16 10:17:25.187119 7f455d066700 1 -- 5.9.122.115:6806/29741 <==
>> osd.9 5.9.143.75:6811/25411 20966649 ==== osd_op_reply(41933394
>> 10000003e4d.00000000 [getxattr (37)] ondisk = 0) v4 ==== 119+0+37
>> (2687073223 0 917274488) 0x2fd7200 con 0x1939b00
>> 2013-09-16 10:17:25.187184 7f455d066700 1 -- 5.9.122.115:6806/29741 -->
>> 144.76.13.103:6789/0 -- mon_get_version(what=osdmap handle=20966631) v1 --
>> ?+0 0x30fcc40 con 0x1875c60
>> 2013-09-16 10:17:25.188452 7f455d066700 1 -- 5.9.122.115:6806/29741 <==
>> mon.1 144.76.13.103:6789/0 21001663 ==== mon_check_map_ack(handle=20966631
>> version=7278) v2 ==== 24+0+0 (24140592 0 0) 0x30fce00 con 0x1875c60
>> 2013-09-16 10:17:25.188549 7f455b863700 1 -- 5.9.122.115:6806/29741 -->
>> 5.9.143.75:6811/25411 -- osd_op(mds.0.212:41933396 10000003e4d.00000000
>> [getxattr parent] 1.82468d56 e7278) v4 -- ?+0 0x7a63d80 con 0x1939b00
>> 2013-09-16 10:17:25.190273 7f455d066700 1 -- 5.9.122.115:6806/29741 <==
>> osd.9 5.9.143.75:6811/25411 20966650 ==== osd_op_reply(41933396
>> 10000003e4d.00000000 [getxattr (37)] ondisk = 0) v4 ==== 119+0+37
>> (2687073223 0 917274488) 0x2fd7000 con 0x1939b00
>> 2013-09-16 10:17:25.190338 7f455d066700 1 -- 5.9.122.115:6806/29741 -->
>> 144.76.13.103:6789/0 -- mon_get_version(what=osdmap handle=20966632) v1 --
>> ?+0 0x30fce00 con 0x1875c60
>> 2013-09-16 10:17:25.191054 7f455d066700 1 -- 5.9.122.115:6806/29741 <==
>> mon.1 144.76.13.103:6789/0 21001664 ==== mon_check_map_ack(handle=20966632
>> version=7278) v2 ==== 24+0+0 (3901240524 0 0) 0x30fc380 con 0x1875c60
>>
>>
>>>
>>> -Greg
>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>
>>>
>>> On Sun, Sep 15, 2013 at 8:24 AM, Serge Slipchenko
>>> <serge.slipchenko@xxxxxxxxx> wrote:
>>> > Hi,
>>> >
>>> > I'm testing ceph 0.67.3 (408cd61584c72c0d97b774b3d8f95c6b1b06341a) under
>>> > load.
>>> > My configuration has 2 MDSs, 3 mons and 16 OSDs. The mons and MDSs are on
>>> > separate servers, and the OSDs are distributed across 8 servers.
>>> >
>>> > 3 servers with several processes read and write via libcephfs.
>>> >
>>> > Restarting the active MDS leads to an infinite rejoin and complete
>>> > inaccessibility of the cephfs.
>>> >
>>> > It seems related to the bug http://tracker.ceph.com/issues/4637
>>> >
>>> > --
>>> > Kind regards, Serge Slipchenko
>>> >
>>> > _______________________________________________
>>> > ceph-users mailing list
>>> > ceph-users@xxxxxxxxxxxxxx
>>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>> >
>>
>>
>>
>>
>> --
>> Kind regards, Serge Slipchenko
>
>
>
>
> --
> Kind regards, Serge Slipchenko
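In the debug output quoted in this thread, the MDS fetches the backtrace for the same inode (10000003e4d) again and again, alternating between "old object in pool 1, retrying pool -1" and "no object in pool -1, retrying pool 1". Below is a minimal Python sketch for spotting that pattern in a saved MDS log. It assumes the log contains `_open_ino_backtrace_fetched` lines shaped like the ones quoted here. The function names and the threshold of 3 are hypothetical choices for illustration; this is not a Ceph tool, and a high count only suggests a loop rather than proving one.

```python
import re
from collections import defaultdict

# Matches the mds.0.cache message seen in this thread, e.g.
#   "_open_ino_backtrace_fetched ino 10000003e4d errno -2"
FETCHED = re.compile(r"_open_ino_backtrace_fetched ino ([0-9a-f]+) errno (-?\d+)")

# Three lines copied from the debug_mds 20 excerpt (unwrapped onto single lines).
SAMPLE = """\
2013-09-16 10:15:36.724257 7f455d066700 10 mds.0.cache _open_ino_backtrace_fetched ino 10000003e4d errno 0
2013-09-16 10:15:36.725296 7f455b863700 10 mds.0.cache _open_ino_backtrace_fetched ino 10000003e4d errno -2
2013-09-16 10:15:36.727241 7f455d066700 10 mds.0.cache _open_ino_backtrace_fetched ino 10000003e4d errno 0
"""


def fetch_errnos(log_text):
    """Map each inode (hex string) to the errno values of its backtrace fetches, in log order."""
    result = defaultdict(list)
    for match in FETCHED.finditer(log_text):
        result[match.group(1)].append(int(match.group(2)))
    return dict(result)


def looping_inodes(log_text, threshold=3):
    """Return inodes fetched at least `threshold` times (threshold is an arbitrary assumption)."""
    return sorted(ino for ino, errnos in fetch_errnos(log_text).items()
                  if len(errnos) >= threshold)


if __name__ == "__main__":
    for ino in looping_inodes(SAMPLE):
        print(ino, fetch_errnos(SAMPLE)[ino])
```

For a real log, read the file into a string and pass it to `looping_inodes`; the right threshold depends on how long the capture ran.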