Assert failed in PG.cc:5235 ("we got a bad state machine event")

Hi list,
   I got the following log while running a test on top of Ceph. This part of the code seems quite fresh (it does not yet appear in 0.56.1); any idea about what happened?


pgs=714 cs=11 l=0).reader got old message 1 <= 6 0x4552800 osd_map(363..375 src has 1..375) v3, discarding
2013-01-25 14:12:53.979018 7f192cd96700  0 -- 192.101.11.124:6804/10783 >> 192.101.11.121:6849/17270 pipe(0xe7ad400 sd=245 :6804 s=2 pgs=486 cs=11 l=0).fault with nothing to send, going to standby
2013-01-25 14:12:54.987534 7f192bf88700  0 -- 192.101.11.124:6804/10783 >> 192.101.11.125:6804/80050 pipe(0x3a12000 sd=48 :41394 s=2 pgs=4275 cs=3 l=0).fault, initiating reconnect
2013-01-25 14:12:54.989279 7f1947859700 -1 osd.510 369 heartbeat_check: no reply from osd.37 since 2013-01-25 14:12:17.748513 (cutoff 2013-01-25 14:12:34.989276)
2013-01-25 14:12:54.989428 7f192bf88700  0 -- 192.101.11.124:6804/10783 >> 192.101.11.125:6804/80050 pipe(0x3a12000 sd=47 :41473 s=2 pgs=4281 cs=5 l=0).reader got old message 1 <= 2 0x740e800 osd_map(363..377 src has 1..377) v3, discarding
2013-01-25 14:12:54.997166 7f192945d700  0 -- 192.101.11.124:6804/10783 >> 192.101.11.121:6872/16641 pipe(0x1b01180 sd=27 :38664 s=2 pgs=714 cs=11 l=0).fault, initiating reconnect
2013-01-25 14:12:54.997212 7f192bf88700  0 -- 192.101.11.124:6804/10783 >> 192.101.11.125:6804/80050 pipe(0x3a12000 sd=47 :41473 s=2 pgs=4281 cs=5 l=0).fault, initiating reconnect
2013-01-25 14:12:54.998554 7f192945d700  0 -- 192.101.11.124:6804/10783 >> 192.101.11.121:6872/16641 pipe(0x1b01180 sd=27 :38743 s=2 pgs=715 cs=13 l=0).reader got old message 1 <= 6 0x4850c00 osd_map(370..377 src has 1..377) v3, discarding
2013-01-25 14:12:54.998565 7f192bf88700  0 -- 192.101.11.124:6804/10783 >> 192.101.11.125:6804/80050 pipe(0x3a12000 sd=47 :41477 s=2 pgs=4282 cs=7 l=0).reader got old message 1 <= 2 0x8d4da00 osd_map(370..377 src has 1..377) v3, discarding
2013-01-25 14:12:55.202858 7f193f849700  0 log [WRN] : map e377 wrongly marked me down
2013-01-25 14:12:58.985680 7f1929d66700  0 -- 192.101.11.124:6874/10783 >> 192.101.11.120:6831/1580 pipe(0x390cc80 sd=38 :60934 s=2 pgs=535 cs=1 l=0).fault with nothing to send, going to standby
2013-01-25 14:12:59.030720 7f1935b23700  0 -- 192.101.11.124:6874/10783 >> 192.101.11.121:6872/16641 pipe(0x7a7b680 sd=58 :38862 s=2 pgs=725 cs=1 l=0).fault, initiating reconnect
2013-01-25 14:12:59.031210 7f1935a22700  0 -- 192.101.11.124:6874/10783 >> 192.101.11.125:6804/80050 pipe(0x1b95180 sd=59 :41596 s=2 pgs=4285 cs=1 l=0).fault, initiating reconnect
2013-01-25 14:12:59.032473 7f1935b23700  0 -- 192.101.11.124:6874/10783 >> 192.101.11.121:6872/16641 pipe(0x7a7b680 sd=58 :38864 s=2 pgs=726 cs=3 l=0).fault with nothing to send, going to standby
2013-01-25 14:12:59.032721 7f1935a22700  0 -- 192.101.11.124:6874/10783 >> 192.101.11.125:6804/80050 pipe(0x1b95180 sd=59 :41598 s=2 pgs=4286 cs=3 l=0).fault with nothing to send, going to standby
2013-01-25 14:12:59.131044 7f193b040700 -1 osd/PG.cc: In function 'PG::RecoveryState::Crashed::Crashed(boost::statechart::state<PG::RecoveryState::Crashed, PG::RecoveryState::RecoveryMachine>::my_context)' thread 7f193b040700 time 2013-01-25 14:12:59.030259
osd/PG.cc: 5235: FAILED assert(0 == "we got a bad state machine event")

ceph version 0.56-417-g67c7757 (67c77577bdfe4985aa50e91986677c742b7cc85f)
1: (PG::RecoveryState::Crashed::Crashed(boost::statechart::state<PG::RecoveryState::Crashed, PG::RecoveryState::RecoveryMachine, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::my_context)+0xc4) [0x740c84]
2: /usr/bin/ceph-osd() [0x77f189]
3: (boost::statechart::detail::reaction_result boost::statechart::simple_state<PG::RecoveryState::Reset, PG::RecoveryState::RecoveryMachine, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::local_react_impl_non_empty::local_react_impl<boost::mpl::list2<boost::statechart::custom_reaction<PG::FlushedEvt>, boost::statechart::transition<boost::statechart::event_base, PG::RecoveryState::Crashed, boost::statechart::detail::no_context<boost::statechart::event_base>, &boost::statechart::detail::no_context<boost::statechart::event_base>::no_function> >, boost::statechart::simple_state<PG::RecoveryState::Reset, PG::RecoveryState::RecoveryMachine, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0> >(boost::statechart::simple_state<PG::RecoveryState::Reset, PG::RecoveryState::RecoveryMachine, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>&, boost::statechart::event_base const&, void const*)+0xb5) [0x7a1e65]
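
For context on the assert itself: frame 3 of the backtrace shows that Reset's reaction list pairs a custom_reaction<PG::FlushedEvt> with a catch-all transition<boost::statechart::event_base, Crashed>, so any event Reset has no specific reaction for sends the PG into Crashed, and Crashed's constructor is what fires the FAILED assert. Below is a minimal sketch of that Boost.Statechart pattern; it is illustrative only, not the actual osd/PG.cc code, and the UnexpectedEvt event name is made up.

// Minimal sketch of the pattern visible in the backtrace (not Ceph's code):
// a Reset state with a catch-all transition on event_base to a Crashed
// state whose constructor asserts.
#include <boost/statechart/state_machine.hpp>
#include <boost/statechart/state.hpp>
#include <boost/statechart/simple_state.hpp>
#include <boost/statechart/transition.hpp>
#include <boost/statechart/event.hpp>
#include <boost/statechart/event_base.hpp>
#include <cassert>

namespace sc = boost::statechart;

// Hypothetical event for which Reset has no explicit reaction.
struct UnexpectedEvt : sc::event<UnexpectedEvt> {};

struct Reset;
struct RecoveryMachine : sc::state_machine<RecoveryMachine, Reset> {};

struct Crashed : sc::state<Crashed, RecoveryMachine> {
  explicit Crashed(my_context ctx) : my_base(ctx) {
    // The failing assert from osd/PG.cc:5235.
    assert(0 == "we got a bad state machine event");
  }
};

struct Reset : sc::simple_state<Reset, RecoveryMachine> {
  // Catch-all: any event without a more specific reaction goes to Crashed.
  typedef sc::transition<sc::event_base, Crashed> reactions;
};

int main() {
  RecoveryMachine machine;
  machine.initiate();                     // enters Reset
  machine.process_event(UnexpectedEvt()); // no handler -> Crashed -> assert
}

In other words, Crashed appears to be a deliberate trap state: reaching it means the recovery machine was handed an event that the currently active states did not expect, and the OSD aborts rather than continuing with the PG in an unknown state.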

