On Mon, Nov 3, 2014 at 7:46 AM, Chad Seys <cwseys@xxxxxxxxxxxxxxxx> wrote:
> Hi All,
> I upgraded from emperor to firefly. Initial upgrade went smoothly and all
> placement groups were active+clean .
> Next I executed
> 'ceph osd crush tunables optimal'
> to upgrade CRUSH mapping.

Okay...you know that's a data movement command, right? So you should
expect it to impact operations. (Although not the crashes you're
witnessing.)

> Now I keep having OSDs go down or have requests blocked for long periods of
> time.
> I start back up the down OSDs and recovery eventually stops, but with 100s
> of "incomplete" and "down+incomplete" pgs remaining.
> The ceph web page says "If you see this state [incomplete], report a bug,
> and try to start any failed OSDs that may contain the needed information."
> Well, all the OSDs are up, though some have blocked requests.
>
> Also, the logs of the OSDs which go down have this message:
> 2014-11-02 21:46:33.615829 7ffcf0421700 0 -- 192.168.164.192:6810/31314 >>
> 192.168.164.186:6804/20934 pipe(0x2faa0280 sd=261 :6810 s=2 pgs=9
> 19 cs=25 l=0 c=0x2ed022c0).fault with nothing to send, going to standby
> 2014-11-02 21:49:11.440142 7ffce4cf3700 0 -- 192.168.164.192:6810/31314 >>
> 192.168.164.186:6804/20934 pipe(0xe512a00 sd=249 :6810 s=0 pgs=0
> cs=0 l=0 c=0x2a308b00).accept connect_seq 26 vs existing 25 state standby
> 2014-11-02 21:51:20.085676 7ffcf6e3e700 -1 osd/PG.cc: In function
> 'PG::RecoveryState::Crashed::Crashed(boost::statechart::state<PG::RecoveryS
> tate::Crashed, PG::RecoveryState::RecoveryMachine>::my_context)' thread
> 7ffcf6e3e700 time 2014-11-02 21:51:20.052242
> osd/PG.cc: 5424: FAILED assert(0 == "we got a bad state machine event")

These failures are usually the result of adjusting tunables without
having upgraded all the machines in the cluster, although they should
also be fixed in v0.80.7. Are you still seeing crashes, or just the PG
state issues?
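As a rough way to check the "upgraded all the machines" condition before
touching tunables, something like the sketch below could flag daemons that
are still on an older release. It is only an illustration: the helper names
(parse_version, daemons_below) and the sample dict are hypothetical, and it
assumes you have already collected each daemon's version string yourself
(for example from `ceph tell osd.N version`) in the usual
"ceph version X.Y.Z (<commit>)" form.

```python
import re

# Assumption: every daemon reports a string shaped like
# "ceph version 0.80.7 (<commit hash>)". Gathering those strings from the
# cluster is deliberately left out of this sketch.
VERSION_RE = re.compile(r"ceph version (\d+)\.(\d+)\.(\d+)")


def parse_version(report):
    """Return (major, minor, patch) from a version string, or None."""
    match = VERSION_RE.search(report)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def daemons_below(reports, minimum=(0, 80, 7)):
    """List daemons older than `minimum`, or whose version is unparseable.

    `reports` is a hypothetical dict such as {"osd.0": "ceph version ..."}.
    """
    stale = []
    for name, report in sorted(reports.items()):
        version = parse_version(report)
        # Treat an unreadable version as suspect rather than assuming it is new.
        if version is None or version < minimum:
            stale.append(name)
    return stale


if __name__ == "__main__":
    # Made-up sample data, not taken from the cluster in this thread.
    sample = {
        "osd.0": "ceph version 0.80.7 (placeholder)",
        "osd.1": "ceph version 0.72.2 (placeholder)",
    }
    print(daemons_below(sample))
```

If the returned list is non-empty, those daemons would be the ones to
upgrade and restart before changing tunables.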
-Greg