Re: mds stuck in rejoin

Awesome, glad a simple upgrade fixed it for you. :)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Mon, Sep 16, 2013 at 6:18 AM, Serge Slipchenko
<serge.slipchenko@xxxxxxxxx> wrote:
> Hi,
>
> Searching the web, I found a report with similar symptoms:
> http://tracker.ceph.com/issues/6087
> It turned out that my ceph-mds had not been updated and was still at 0.67.2,
> which doesn't have the MDS patch.
> After updating to 0.67.3, the MDS stabilized.
>
> I am terribly sorry, but I hope that my bad experience will help someone.
>
> On Mon, Sep 16, 2013 at 11:25 AM, Serge Slipchenko
> <serge.slipchenko@xxxxxxxxx> wrote:
>>
>> Hi Gregory,
>>
>> On Sun, Sep 15, 2013 at 10:59 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>>
>>> What's the output of "ceph -s", and have you tried running the MDS
>>> with any logging enabled that we can check out?
>>
>>
>> See below for the output of sudo ceph -s, followed by the MDS log after
>> sudo ceph mds tell 0 injectargs '--debug_ms 20 --debug_mds 20' and after
>> sudo ceph mds tell 0 injectargs '--debug_ms 1 --debug_mds 1'
>>
>> sudo ceph -s
>>    cluster 920ff156-998f-44a9-a0c6-5bc265d4ac2e
>>    health HEALTH_WARN mds cluster is degraded
>>    monmap e7: 3 mons at
>> {s01=144.76.13.102:6789/0,s02=144.76.13.103:6789/0,s03=144.76.13.105:6789/0},
>> election epoch 4680, quorum 0,1,2 s01,s02,s03
>>    osdmap e7278: 16 osds: 16 up, 16 in
>>     pgmap v1955548: 704 pgs: 704 active+clean; 207 GB data, 426 GB used,
>> 38463 GB / 40971 GB avail; 338KB/s rd, 338op/s
>>    mdsmap e1307: 1/1/1 up {0=m02=up:rejoin}, 1 up:standby
>>
>> sudo ceph mds tell 0 injectargs '--debug_ms 20 --debug_mds 20'
>>
>> 2013-09-16 10:15:36.724250 7f455864d700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).writer sleeping
>> 2013-09-16 10:15:36.724257 7f455d066700 10 mds.0.cache
>> _open_ino_backtrace_fetched ino 10000003e4d errno 0
>> 2013-09-16 10:15:36.724264 7f455d066700 10 mds.0.cache  old object in pool
>> 1, retrying pool -1
>> 2013-09-16 10:15:36.724289 7f455d066700  1 -- 5.9.122.115:6806/29741 -->
>> 144.76.13.103:6789/0 -- mon_get_version(what=osdmap handle=20931738) v1 --
>> ?+0 0x30e4540 con 0x1875c60
>> 2013-09-16 10:15:36.724296 7f455d066700 20 -- 5.9.122.115:6806/29741
>> submit_message mon_get_version(what=osdmap handle=20931738) v1 remote,
>> 144.76.13.103:6789/0, have pipe.
>> 2013-09-16 10:15:36.724313 7f455d066700 10 -- 5.9.122.115:6806/29741
>> dispatch_throttle_release 156 to dispatch throttler 156/104857600
>> 2013-09-16 10:15:36.724322 7f455d066700 20 -- 5.9.122.115:6806/29741 done
>> calling dispatch on 0x1898000
>> 2013-09-16 10:15:36.724318 7f4559e5e700 10 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).writer: state = open policy.server=0
>> 2013-09-16 10:15:36.724337 7f4559e5e700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).writer encoding 20966720 features 34359738367 0x30e4540
>> mon_get_version(what=osdmap handle=20931738) v1
>> 2013-09-16 10:15:36.724356 7f4559e5e700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).writer no session security
>> 2013-09-16 10:15:36.724365 7f4559e5e700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).writer sending 20966720 0x30e4540
>> 2013-09-16 10:15:36.724388 7f4559e5e700 10 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).writer: state = open policy.server=0
>> 2013-09-16 10:15:36.724396 7f4559e5e700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).writer sleeping
>> 2013-09-16 10:15:36.725105 7f455af61700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).reader got ACK
>> 2013-09-16 10:15:36.725124 7f455af61700 15 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).reader got ack seq 20966720
>> 2013-09-16 10:15:36.725133 7f455af61700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).reader reading tag...
>> 2013-09-16 10:15:36.725143 7f455af61700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).reader got MSG
>> 2013-09-16 10:15:36.725152 7f455af61700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).reader got envelope type=20 src mon.1 front=24 data=0 off 0
>> 2013-09-16 10:15:36.725162 7f455af61700 10 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).reader wants 24 from dispatch throttler 0/104857600
>> 2013-09-16 10:15:36.725172 7f455af61700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).reader got front 24
>> 2013-09-16 10:15:36.725180 7f455af61700 10 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).aborted = 0
>> 2013-09-16 10:15:36.725187 7f455af61700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).reader got 24 + 0 + 0 byte message
>> 2013-09-16 10:15:36.725195 7f455af61700 10 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).No session security set
>> 2013-09-16 10:15:36.725204 7f455af61700 10 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).reader got message 20966742 0x30e4e00
>> mon_check_map_ack(handle=20931738 version=7278) v2
>> 2013-09-16 10:15:36.725212 7f455af61700 20 -- 5.9.122.115:6806/29741 queue
>> 0x30e4e00 prio 196
>> 2013-09-16 10:15:36.725218 7f455af61700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).reader reading tag...
>> 2013-09-16 10:15:36.725236 7f4559e5e700 10 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).writer: state = open policy.server=0
>> 2013-09-16 10:15:36.725241 7f455d066700  1 -- 5.9.122.115:6806/29741 <==
>> mon.1 144.76.13.103:6789/0 20966742 ==== mon_check_map_ack(handle=20931738
>> version=7278) v2 ==== 24+0+0 (2168433882 0 0) 0x30e4e00 con 0x1875c60
>> 2013-09-16 10:15:36.725263 7f4559e5e700 10 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).write_ack 20966742
>> 2013-09-16 10:15:36.725275 7f455d066700 10 -- 5.9.122.115:6806/29741
>> dispatch_throttle_release 24 to dispatch throttler 24/104857600
>> 2013-09-16 10:15:36.725286 7f455d066700 20 -- 5.9.122.115:6806/29741 done
>> calling dispatch on 0x30e4e00
>> 2013-09-16 10:15:36.725285 7f4559e5e700 10 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).writer: state = open policy.server=0
>> 2013-09-16 10:15:36.725296 7f455b863700 10 mds.0.cache
>> _open_ino_backtrace_fetched ino 10000003e4d errno -2
>> 2013-09-16 10:15:36.725306 7f455b863700 10 mds.0.cache  no object in pool
>> -1, retrying pool 1
>> 2013-09-16 10:15:36.725298 7f4559e5e700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).writer sleeping
>> 2013-09-16 10:15:36.725326 7f455b863700  1 -- 5.9.122.115:6806/29741 -->
>> 5.9.143.75:6811/25411 -- osd_op(mds.0.212:41863610 10000003e4d.00000000
>> [getxattr parent] 1.82468d56 e7278) v4 -- ?+0 0x1860480 con 0x1939b00
>> 2013-09-16 10:15:36.725341 7f455b863700 20 -- 5.9.122.115:6806/29741
>> submit_message osd_op(mds.0.212:41863610 10000003e4d.00000000 [getxattr
>> parent] 1.82468d56 e7278) v4 remote, 5.9.143.75:6811/25411, have pipe.
>> 2013-09-16 10:15:36.725387 7f455864d700 10 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).writer: state = open policy.server=0
>> 2013-09-16 10:15:36.725409 7f455864d700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).writer encoding 20931757 features 34359738367 0x1860480
>> osd_op(mds.0.212:41863610 10000003e4d.00000000 [getxattr parent] 1.82468d56
>> e7278) v4
>> 2013-09-16 10:15:36.725468 7f455864d700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).writer signed seq # 20931757): sig = 10143984293515600654
>> 2013-09-16 10:15:36.725487 7f455864d700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).writer sending 20931757 0x1860480
>> 2013-09-16 10:15:36.725509 7f455864d700 10 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).writer: state = open policy.server=0
>> 2013-09-16 10:15:36.725518 7f455864d700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).writer sleeping
>> 2013-09-16 10:15:36.727034 7f455834a700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader got ACK
>> 2013-09-16 10:15:36.727055 7f455834a700 15 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader got ack seq 20931757
>> 2013-09-16 10:15:36.727064 7f455834a700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader reading tag...
>> 2013-09-16 10:15:36.727074 7f455834a700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader got MSG
>> 2013-09-16 10:15:36.727085 7f455834a700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader got envelope type=43 src osd.9 front=119 data=37 off 0
>> 2013-09-16 10:15:36.727095 7f455834a700 10 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader wants 156 from dispatch throttler 0/104857600
>> 2013-09-16 10:15:36.727105 7f455834a700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader got front 119
>> 2013-09-16 10:15:36.727113 7f455834a700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader allocating new rx buffer at offset 0
>> 2013-09-16 10:15:36.727119 7f455834a700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader reading nonblocking into 0x5dc5690 len 37
>> 2013-09-16 10:15:36.727129 7f455834a700 10 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).aborted = 0
>> 2013-09-16 10:15:36.727135 7f455834a700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader got 119 + 0 + 37 byte message
>> 2013-09-16 10:15:36.727180 7f455834a700 10 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader got message 20931757 0x1898e00 osd_op_reply(41863610
>> 10000003e4d.00000000 [getxattr (37)] ondisk = 0) v4
>> 2013-09-16 10:15:36.727192 7f455834a700 20 -- 5.9.122.115:6806/29741 queue
>> 0x1898e00 prio 127
>> 2013-09-16 10:15:36.727201 7f455834a700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).reader reading tag...
>> 2013-09-16 10:15:36.727214 7f455d066700  1 -- 5.9.122.115:6806/29741 <==
>> osd.9 5.9.143.75:6811/25411 20931757 ==== osd_op_reply(41863610
>> 10000003e4d.00000000 [getxattr (37)] ondisk = 0) v4 ==== 119+0+37
>> (2687073223 0 917274488) 0x1898e00 con 0x1939b00
>> 2013-09-16 10:15:36.727213 7f455864d700 10 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).writer: state = open policy.server=0
>> 2013-09-16 10:15:36.727241 7f455d066700 10 mds.0.cache
>> _open_ino_backtrace_fetched ino 10000003e4d errno 0
>> 2013-09-16 10:15:36.727236 7f455864d700 10 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).write_ack 20931757
>> 2013-09-16 10:15:36.727246 7f455d066700 10 mds.0.cache  old object in pool
>> 1, retrying pool -1
>> 2013-09-16 10:15:36.727253 7f455864d700 10 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).writer: state = open policy.server=0
>> 2013-09-16 10:15:36.727262 7f455864d700 20 -- 5.9.122.115:6806/29741 >>
>> 5.9.143.75:6811/25411 pipe(0x19ac500 sd=60 :59383 s=2 pgs=345 cs=1 l=1
>> c=0x1939b00).writer sleeping
>> 2013-09-16 10:15:36.727267 7f455d066700  1 -- 5.9.122.115:6806/29741 -->
>> 144.76.13.103:6789/0 -- mon_get_version(what=osdmap handle=20931739) v1 --
>> ?+0 0x30e4e00 con 0x1875c60
>> 2013-09-16 10:15:36.727274 7f455d066700 20 -- 5.9.122.115:6806/29741
>> submit_message mon_get_version(what=osdmap handle=20931739) v1 remote,
>> 144.76.13.103:6789/0, have pipe.
>> 2013-09-16 10:15:36.727287 7f455d066700 10 -- 5.9.122.115:6806/29741
>> dispatch_throttle_release 156 to dispatch throttler 156/104857600
>> 2013-09-16 10:15:36.727292 7f455d066700 20 -- 5.9.122.115:6806/29741 done
>> calling dispatch on 0x1898e00
>> 2013-09-16 10:15:36.727295 7f4559e5e700 10 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).writer: state = open policy.server=0
>> 2013-09-16 10:15:36.727322 7f4559e5e700 20 -- 5.9.122.115:6806/29741 >>
>> 144.76.13.103:6789/0 pipe(0x18d5780 sd=42 :41897 s=2 pgs=1261 cs=1 l=1
>> c=0x1875c60).writer encoding 20966721 features 34359738367 0x30e4e00
>> mon_get_version(what=osdmap handle=20931739)
>> policy.server=0^C
>>
>> sudo ceph mds tell 0 injectargs '--debug_ms 1 --debug_mds 1'
>>
>> 2013-09-16 10:17:25.179033 7f455d066700  1 -- 5.9.122.115:6806/29741 -->
>> 144.76.13.103:6789/0 -- mon_get_version(what=osdmap handle=20966628) v1 --
>> ?+0 0x30e4c40 con 0x1875c60
>> 2013-09-16 10:17:25.179940 7f455d066700  1 -- 5.9.122.115:6806/29741 <==
>> mon.1 144.76.13.103:6789/0 21001660 ==== mon_check_map_ack(handle=20966628
>> version=7278) v2 ==== 24+0+0 (3394005587 0 0) 0x30e4e00 con 0x1875c60
>> 2013-09-16 10:17:25.180059 7f455b863700  1 -- 5.9.122.115:6806/29741 -->
>> 5.9.143.75:6811/25411 -- osd_op(mds.0.212:41933390 10000003e4d.00000000
>> [getxattr parent] 1.82468d56 e7278) v4 -- ?+0 0x7a63b40 con 0x1939b00
>> 2013-09-16 10:17:25.181726 7f455d066700  1 -- 5.9.122.115:6806/29741 <==
>> osd.9 5.9.143.75:6811/25411 20966647 ==== osd_op_reply(41933390
>> 10000003e4d.00000000 [getxattr (37)] ondisk = 0) v4 ==== 119+0+37
>> (2687073223 0 917274488) 0x2fd7600 con 0x1939b00
>> 2013-09-16 10:17:25.181791 7f455d066700  1 -- 5.9.122.115:6806/29741 -->
>> 144.76.13.103:6789/0 -- mon_get_version(what=osdmap handle=20966629) v1 --
>> ?+0 0x30e4e00 con 0x1875c60
>> 2013-09-16 10:17:25.182655 7f455d066700  1 -- 5.9.122.115:6806/29741 <==
>> mon.1 144.76.13.103:6789/0 21001661 ==== mon_check_map_ack(handle=20966629
>> version=7278) v2 ==== 24+0+0 (1879288029 0 0) 0x30e4540 con 0x1875c60
>> 2013-09-16 10:17:25.182758 7f455b863700  1 -- 5.9.122.115:6806/29741 -->
>> 5.9.143.75:6811/25411 -- osd_op(mds.0.212:41933392 10000003e4d.00000000
>> [getxattr parent] 1.82468d56 e7278) v4 -- ?+0 0x7a63900 con 0x1939b00
>> 2013-09-16 10:17:25.184491 7f455d066700  1 -- 5.9.122.115:6806/29741 <==
>> osd.9 5.9.143.75:6811/25411 20966648 ==== osd_op_reply(41933392
>> 10000003e4d.00000000 [getxattr (37)] ondisk = 0) v4 ==== 119+0+37
>> (2687073223 0 917274488) 0x2fd7400 con 0x1939b00
>> 2013-09-16 10:17:25.184556 7f455d066700  1 -- 5.9.122.115:6806/29741 -->
>> 144.76.13.103:6789/0 -- mon_get_version(what=osdmap handle=20966630) v1 --
>> ?+0 0x30e4540 con 0x1875c60
>> 2013-09-16 10:17:25.185415 7f455d066700  1 -- 5.9.122.115:6806/29741 <==
>> mon.1 144.76.13.103:6789/0 21001662 ==== mon_check_map_ack(handle=20966630
>> version=7278) v2 ==== 24+0+0 (3141507518 0 0) 0x30fcc40 con 0x1875c60
>> 2013-09-16 10:17:25.185512 7f455b863700  1 -- 5.9.122.115:6806/29741 -->
>> 5.9.143.75:6811/25411 -- osd_op(mds.0.212:41933394 10000003e4d.00000000
>> [getxattr parent] 1.82468d56 e7278) v4 -- ?+0 0x7a636c0 con 0x1939b00
>> 2013-09-16 10:17:25.187119 7f455d066700  1 -- 5.9.122.115:6806/29741 <==
>> osd.9 5.9.143.75:6811/25411 20966649 ==== osd_op_reply(41933394
>> 10000003e4d.00000000 [getxattr (37)] ondisk = 0) v4 ==== 119+0+37
>> (2687073223 0 917274488) 0x2fd7200 con 0x1939b00
>> 2013-09-16 10:17:25.187184 7f455d066700  1 -- 5.9.122.115:6806/29741 -->
>> 144.76.13.103:6789/0 -- mon_get_version(what=osdmap handle=20966631) v1 --
>> ?+0 0x30fcc40 con 0x1875c60
>> 2013-09-16 10:17:25.188452 7f455d066700  1 -- 5.9.122.115:6806/29741 <==
>> mon.1 144.76.13.103:6789/0 21001663 ==== mon_check_map_ack(handle=20966631
>> version=7278) v2 ==== 24+0+0 (24140592 0 0) 0x30fce00 con 0x1875c60
>> 2013-09-16 10:17:25.188549 7f455b863700  1 -- 5.9.122.115:6806/29741 -->
>> 5.9.143.75:6811/25411 -- osd_op(mds.0.212:41933396 10000003e4d.00000000
>> [getxattr parent] 1.82468d56 e7278) v4 -- ?+0 0x7a63d80 con 0x1939b00
>> 2013-09-16 10:17:25.190273 7f455d066700  1 -- 5.9.122.115:6806/29741 <==
>> osd.9 5.9.143.75:6811/25411 20966650 ==== osd_op_reply(41933396
>> 10000003e4d.00000000 [getxattr (37)] ondisk = 0) v4 ==== 119+0+37
>> (2687073223 0 917274488) 0x2fd7000 con 0x1939b00
>> 2013-09-16 10:17:25.190338 7f455d066700  1 -- 5.9.122.115:6806/29741 -->
>> 144.76.13.103:6789/0 -- mon_get_version(what=osdmap handle=20966632) v1 --
>> ?+0 0x30fce00 con 0x1875c60
>> 2013-09-16 10:17:25.191054 7f455d066700  1 -- 5.9.122.115:6806/29741 <==
>> mon.1 144.76.13.103:6789/0 21001664 ==== mon_check_map_ack(handle=20966632
>> version=7278) v2 ==== 24+0+0 (3901240524 0 0) 0x30fc380 con 0x1875c60
>>
>>
>>>
>>> -Greg
>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>
>>>
>>> On Sun, Sep 15, 2013 at 8:24 AM, Serge Slipchenko
>>> <serge.slipchenko@xxxxxxxxx> wrote:
>>> > Hi,
>>> >
>>> > I'm testing ceph 0.67.3 (408cd61584c72c0d97b774b3d8f95c6b1b06341a)
>>> > under load.
>>> > My configuration has 2 MDSs, 3 mons and 16 OSDs. The mon and MDS
>>> > daemons are on separate servers, and the OSDs are distributed across
>>> > 8 servers.
>>> >
>>> > 3 servers, each running several processes, read and write via libcephfs.
>>> >
>>> > Restarting the active MDS leads to an endless rejoin and makes CephFS
>>> > completely inaccessible.
>>> >
>>> > It seems related to bug http://tracker.ceph.com/issues/4637
>>> >
>>> > --
>>> > Kind regards, Serge Slipchenko
>>> >
>>> > _______________________________________________
>>> > ceph-users mailing list
>>> > ceph-users@xxxxxxxxxxxxxx
>>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>> >
>>
>>
>>
>>
>> --
>> Kind regards, Serge Slipchenko
>
>
>
>
> --
> Kind regards, Serge Slipchenko
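
The quoted logs show the MDS fetching the backtrace for the same inode
(10000003e4d) over and over, alternating between "old object in pool 1,
retrying pool -1" and "no object in pool -1, retrying pool 1". A rough way to
spot that pattern in a pasted log excerpt is to count the
_open_ino_backtrace_fetched events per inode. The sketch below is only an
illustration: the function names and the threshold of 3 are made up for this
example, and it assumes the log text looks like the excerpts in this thread
(possibly with mail quote markers and wrapped lines).

```python
import re
from collections import Counter

# Matches e.g. "_open_ino_backtrace_fetched ino 10000003e4d errno -2"
FETCH_RE = re.compile(
    r"_open_ino_backtrace_fetched\s+ino\s+([0-9a-f]+)\s+errno\s+(-?\d+)"
)

def unwrap(text):
    """Strip leading mail quote markers ('>', '>>', ...) and join wrapped lines."""
    parts = []
    for line in text.splitlines():
        parts.append(re.sub(r"^(?:>\s?)+", "", line).strip())
    return " ".join(parts)

def backtrace_fetch_counts(text):
    """Count _open_ino_backtrace_fetched events per inode (hex string)."""
    return Counter(m.group(1) for m in FETCH_RE.finditer(unwrap(text)))

def looping_inodes(text, threshold=3):
    """Return inodes fetched at least `threshold` times (hypothetical cutoff)."""
    counts = backtrace_fetch_counts(text)
    return sorted(ino for ino, n in counts.items() if n >= threshold)
```

A high count for a single inode within a few milliseconds, as in the excerpt
above, suggests the MDS is cycling on that inode rather than making progress.
It is a hint, not a diagnosis.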
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
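
Since the root cause here was a ceph-mds package left at 0.67.2 while the rest
of the cluster was on 0.67.3, it may be worth checking the version of every
daemon after an upgrade. The sketch below only parses and compares a version
string. The 0.67.3 threshold comes from this thread. The function names are
made up for this example, and how the string is collected on each host (for
example from the package manager) is left out and depends on your setup.

```python
import re

def parse_ceph_version(text):
    """Extract (major, minor, patch) from a string such as
    'ceph 0.67.3 (408cd61584c72c0d97b774b3d8f95c6b1b06341a)'."""
    m = re.search(r"\b(\d+)\.(\d+)\.(\d+)\b", text)
    if m is None:
        raise ValueError("no x.y.z version found in: %r" % text)
    return tuple(int(part) for part in m.groups())

def has_mds_fix(version_text, fixed_in=(0, 67, 3)):
    """True if the version is at least `fixed_in` (0.67.3 per this thread)."""
    return parse_ceph_version(version_text) >= fixed_in
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes
such as treating "0.67.10" as older than "0.67.3".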



