Hi list, I got the following log while running a test on top of Ceph. This part of the code seems quite fresh (it does not yet appear in 0.56.1); any idea what is happening?

pgs=714 cs=11 l=0).reader got old message 1 <= 6 0x4552800 osd_map(363..375 src has 1..375) v3, discarding
2013-01-25 14:12:53.979018 7f192cd96700 0 -- 192.101.11.124:6804/10783 >> 192.101.11.121:6849/17270 pipe(0xe7ad400 sd=245 :6804 s=2 pgs=486 cs=11 l=0).fault with nothing to send, going to standby
2013-01-25 14:12:54.987534 7f192bf88700 0 -- 192.101.11.124:6804/10783 >> 192.101.11.125:6804/80050 pipe(0x3a12000 sd=48 :41394 s=2 pgs=4275 cs=3 l=0).fault, initiating reconnect
2013-01-25 14:12:54.989279 7f1947859700 -1 osd.510 369 heartbeat_check: no reply from osd.37 since 2013-01-25 14:12:17.748513 (cutoff 2013-01-25 14:12:34.989276)
2013-01-25 14:12:54.989428 7f192bf88700 0 -- 192.101.11.124:6804/10783 >> 192.101.11.125:6804/80050 pipe(0x3a12000 sd=47 :41473 s=2 pgs=4281 cs=5 l=0).reader got old message 1 <= 2 0x740e800 osd_map(363..377 src has 1..377) v3, discarding
2013-01-25 14:12:54.997166 7f192945d700 0 -- 192.101.11.124:6804/10783 >> 192.101.11.121:6872/16641 pipe(0x1b01180 sd=27 :38664 s=2 pgs=714 cs=11 l=0).fault, initiating reconnect
2013-01-25 14:12:54.997212 7f192bf88700 0 -- 192.101.11.124:6804/10783 >> 192.101.11.125:6804/80050 pipe(0x3a12000 sd=47 :41473 s=2 pgs=4281 cs=5 l=0).fault, initiating reconnect
2013-01-25 14:12:54.998554 7f192945d700 0 -- 192.101.11.124:6804/10783 >> 192.101.11.121:6872/16641 pipe(0x1b01180 sd=27 :38743 s=2 pgs=715 cs=13 l=0).reader got old message 1 <= 6 0x4850c00 osd_map(370..377 src has 1..377) v3, discarding
2013-01-25 14:12:54.998565 7f192bf88700 0 -- 192.101.11.124:6804/10783 >> 192.101.11.125:6804/80050 pipe(0x3a12000 sd=47 :41477 s=2 pgs=4282 cs=7 l=0).reader got old message 1 <= 2 0x8d4da00 osd_map(370..377 src has 1..377) v3, discarding
2013-01-25 14:12:55.202858 7f193f849700 0 log [WRN] : map e377 wrongly marked me down
2013-01-25 14:12:58.985680 7f1929d66700 0 -- 192.101.11.124:6874/10783 >> 192.101.11.120:6831/1580 pipe(0x390cc80 sd=38 :60934 s=2 pgs=535 cs=1 l=0).fault with nothing to send, going to standby
2013-01-25 14:12:59.030720 7f1935b23700 0 -- 192.101.11.124:6874/10783 >> 192.101.11.121:6872/16641 pipe(0x7a7b680 sd=58 :38862 s=2 pgs=725 cs=1 l=0).fault, initiating reconnect
2013-01-25 14:12:59.031210 7f1935a22700 0 -- 192.101.11.124:6874/10783 >> 192.101.11.125:6804/80050 pipe(0x1b95180 sd=59 :41596 s=2 pgs=4285 cs=1 l=0).fault, initiating reconnect
2013-01-25 14:12:59.032473 7f1935b23700 0 -- 192.101.11.124:6874/10783 >> 192.101.11.121:6872/16641 pipe(0x7a7b680 sd=58 :38864 s=2 pgs=726 cs=3 l=0).fault with nothing to send, going to standby
2013-01-25 14:12:59.032721 7f1935a22700 0 -- 192.101.11.124:6874/10783 >> 192.101.11.125:6804/80050 pipe(0x1b95180 sd=59 :41598 s=2 pgs=4286 cs=3 l=0).fault with nothing to send, going to standby
2013-01-25 14:12:59.131044 7f193b040700 -1 osd/PG.cc: In function 'PG::RecoveryState::Crashed::Crashed(boost::statechart::state<PG::RecoveryState::Crashed, PG::RecoveryState::RecoveryMachine>::my_context)' thread 7f193b040700 time 2013-01-25 14:12:59.030259
osd/PG.cc: 5235: FAILED assert(0 == "we got a bad state machine event")
ceph version 0.56-417-g67c7757 (67c77577bdfe4985aa50e91986677c742b7cc85f)
1: (PG::RecoveryState::Crashed::Crashed(boost::statechart::state<PG::RecoveryState::Crashed, PG::RecoveryState::RecoveryMachine, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::my_context)+0xc4) [0x740c84]
2: /usr/bin/ceph-osd() [0x77f189]
3: (boost::statechart::detail::reaction_result boost::statechart::simple_state<PG::RecoveryState::Reset, PG::RecoveryState::RecoveryMachine, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::local_react_impl_non_empty::local_react_impl<boost::mpl::list2<boost::statechart::custom_reaction<PG::FlushedEvt>, boost::statechart::transition<boost::statechart::event_base, PG::RecoveryState::Crashed, boost::statechart::detail::no_context<boost::statechart::event_base>, &boost::statechart::detail::no_context<boost::statechart::event_base>::no_function> >, boost::statechart::simple_state<PG::RecoveryState::Reset, PG::RecoveryState::RecoveryMachine, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0> >(boost::statechart::simple_state<PG::RecoveryState::Reset, PG::RecoveryState::RecoveryMachine, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>&, boost::statechart::event_base const&, void const*)+0xb5) [0x7a1e65]
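For anyone reading along, here is what I understand the backtrace to be saying: frame 3 shows the Reset state's reaction list pairing a custom_reaction<PG::FlushedEvt> with a catch-all transition on event_base into Crashed, and entering Crashed is what fires the assert at osd/PG.cc:5235. Below is a minimal, self-contained sketch of that boost::statechart pattern; it is NOT the actual osd/PG.cc code (the real Crashed is a sc::state with a PG context, and UnexpectedEvt here is a made-up event), the PG state names are reused only for illustration.

#include <boost/statechart/state_machine.hpp>
#include <boost/statechart/simple_state.hpp>
#include <boost/statechart/transition.hpp>
#include <boost/statechart/custom_reaction.hpp>
#include <boost/statechart/event.hpp>
#include <boost/statechart/event_base.hpp>
#include <boost/mpl/list.hpp>
#include <cassert>

namespace sc = boost::statechart;

// Stand-in for an event Reset actually handles (PG::FlushedEvt in the trace).
struct FlushedEvt : sc::event<FlushedEvt> {};
// Hypothetical event that Reset has no specific reaction for.
struct UnexpectedEvt : sc::event<UnexpectedEvt> {};

struct Reset;
struct RecoveryMachine : sc::state_machine<RecoveryMachine, Reset> {};

// Entering Crashed immediately trips the assert, which is where the
// backtrace points (the real code uses assert(0 == "we got a bad state
// machine event")).
struct Crashed : sc::simple_state<Crashed, RecoveryMachine> {
  Crashed() { assert(!"we got a bad state machine event"); }
};

struct Reset : sc::simple_state<Reset, RecoveryMachine> {
  // Second reaction is a catch-all: any event without a more specific
  // reaction triggers a transition to Crashed.
  typedef boost::mpl::list<
    sc::custom_reaction<FlushedEvt>,
    sc::transition<sc::event_base, Crashed> > reactions;

  sc::result react(const FlushedEvt&) { return discard_event(); }
};

int main() {
  RecoveryMachine m;
  m.initiate();
  m.process_event(FlushedEvt());     // handled, machine stays in Reset
  m.process_event(UnexpectedEvt());  // unhandled -> Crashed -> assert fires
}

In other words, the assert itself is just the catch-all firing; the interesting question is which event the active recovery state received that it had no reaction for (here, presumably related to the OSD being wrongly marked down and the map churn above).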