osd dump looks like this:

pool 0 'data' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num 768 pgp_num 768 lpg_num 2 lpgp_num 2 last_change 1 owner 0 crash_replay_interval 45
pool 1 'metadata' rep size 2 crush_ruleset 1 object_hash rjenkins pg_num 768 pgp_num 768 lpg_num 2 lpgp_num 2 last_change 1 owner 0
pool 2 'rbd' rep size 2 crush_ruleset 2 object_hash rjenkins pg_num 768 pgp_num 768 lpg_num 2 lpgp_num 2 last_change 1 owner 0
pool 9 'nova' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num 2568 pgp_num 2568 lpg_num 0 lpgp_num 0 last_change 1435 owner 18446744073709551615 removed_snaps [1~1]
pool 10 'glance' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num 2568 pgp_num 2568 lpg_num 0 lpgp_num 0 last_change 132 owner 18446744073709551615
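Two things stand out in that dump: the owner on pools 9 and 10 is 18446744073709551615, which is 2^64-1, i.e. (uint64_t)-1 printed as unsigned; and pool 9 ('nova') carries removed_snaps [1~1] -- start~length notation, so snap id 1, one snap long -- presumably the snapshot that was created and then deleted. To make the assert in the quoted trace below easier to reason about, here is a minimal, self-contained C++ sketch of the invariant it enforces. This is not the real Ceph interval_set<T> from ./include/interval_set.h -- the type and the double-erase scenario are invented for illustration -- but it shows how erasing an interval the set does not (fully) contain drives a running size counter negative and trips exactly this kind of assert(_size >= 0):

  // Toy model of an interval set tracking removed snaps. NOT the real
  // Ceph code; invented only to illustrate the failing invariant.
  #include <cassert>
  #include <cstdint>
  #include <map>

  struct ToyIntervalSet {
    std::map<uint64_t, uint64_t> m; // start -> length; [1~1] is m[1] = 1
    int64_t size = 0;               // running count of covered snap ids

    void insert(uint64_t start, uint64_t len) {
      m[start] = len;
      size += len;
    }

    // Erase [start, start+len). Subtracting an interval that is not
    // fully contained drives 'size' negative and aborts the process.
    void erase(uint64_t start, uint64_t len) {
      size -= len;
      assert(size >= 0); // analogue of interval_set.h:382
      auto it = m.find(start);
      if (it != m.end() && it->second == len)
        m.erase(it); // full-interval case only; the real code splits intervals
    }
  };

  int main() {
    ToyIntervalSet removed;
    removed.insert(1, 1); // removed_snaps [1~1], as on pool 9
    removed.erase(1, 1);  // fine: size back to 0
    removed.erase(1, 1);  // same interval erased again: size -> -1, abort
  }

I am not claiming a double erase is what actually happens in advance_map -- only that any path that subtracts a removed-snap interval the per-pool set does not contain would die with exactly this assert.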
On Wed, Apr 25, 2012 at 11:04 AM, Tomasz Paszkowski <ss7pro@xxxxxxxxx> wrote:
> Hi,
>
> After making and removing a snapshot from one of the pools, all of the
> OSDs in the cluster are dying with a log like the one below:
>
> 2012-04-25 11:01:00.938313 7f66694b9700 osd.1 1434 removing old osdmap epoch 966
> 2012-04-25 11:01:00.938330 7f66694b9700 osd.1 1434 removing old osdmap epoch 967
> 2012-04-25 11:01:00.938348 7f66694b9700 osd.1 1434 advance to epoch 1435 (<= newest 1470)
> 2012-04-25 11:01:00.939437 7f66694b9700 osd.1 1435 advance_map epoch 1435 1325 pgs
> 2012-04-25 11:01:00.939455 7f66694b9700 osd.1 1435 pool 0 removed snaps [], unchanged (snap_epoch = 0)
> 2012-04-25 11:01:00.939469 7f66694b9700 osd.1 1435 pool 1 removed snaps [], unchanged (snap_epoch = 0)
> 2012-04-25 11:01:00.939482 7f66694b9700 osd.1 1435 pool 2 removed snaps [], unchanged (snap_epoch = 0)
> ./include/interval_set.h: In function 'void interval_set<T>::erase(T, T) [with T = snapid_t]' thread 7f66694b9700 time 2012-04-25 11:01:00.939509
> ./include/interval_set.h: 382: FAILED assert(_size >= 0)
> ceph version 0.44.1 (commit:c89b7f22c8599eb974e75a2f7a5f855358199dee)
> 1: (OSD::advance_map(ObjectStore::Transaction&, C_Contexts*)+0x2971) [0x5cfb51]
> 2: (OSD::handle_osd_map(MOSDMap*)+0x193c) [0x5d162c]
> 3: (OSD::_dispatch(Message*)+0x2eb) [0x5d34fb]
> 4: (OSD::ms_dispatch(Message*)+0x129) [0x5d3a59]
> 5: (SimpleMessenger::dispatch_entry()+0x78b) [0x67513b]
> 6: (SimpleMessenger::DispatchThread::entry()+0xd) [0x52124d]
> 7: (()+0x7e9a) [0x7f6676226e9a]
> 8: (clone()+0x6d) [0x7f66747db4bd]
> ceph version 0.44.1 (commit:c89b7f22c8599eb974e75a2f7a5f855358199dee)
> 1: (OSD::advance_map(ObjectStore::Transaction&, C_Contexts*)+0x2971) [0x5cfb51]
> 2: (OSD::handle_osd_map(MOSDMap*)+0x193c) [0x5d162c]
> 3: (OSD::_dispatch(Message*)+0x2eb) [0x5d34fb]
> 4: (OSD::ms_dispatch(Message*)+0x129) [0x5d3a59]
> 5: (SimpleMessenger::dispatch_entry()+0x78b) [0x67513b]
> 6: (SimpleMessenger::DispatchThread::entry()+0xd) [0x52124d]
> 7: (()+0x7e9a) [0x7f6676226e9a]
> 8: (clone()+0x6d) [0x7f66747db4bd]
> *** Caught signal (Aborted) **
> in thread 7f66694b9700
> ceph version 0.44.1 (commit:c89b7f22c8599eb974e75a2f7a5f855358199dee)
> 1: /usr/bin/ceph-osd() [0x6fa0c6]
> 2: (()+0xfcb0) [0x7f667622ecb0]
> 3: (gsignal()+0x35) [0x7f667471f445]
> 4: (abort()+0x17b) [0x7f6674722bab]
> 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f667506d69d]
> 6: (()+0xb5846) [0x7f667506b846]
> 7: (()+0xb5873) [0x7f667506b873]
> 8: (()+0xb596e) [0x7f667506b96e]
> 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x200) [0x68f420]
> 10: (OSD::advance_map(ObjectStore::Transaction&, C_Contexts*)+0x2971) [0x5cfb51]
> 11: (OSD::handle_osd_map(MOSDMap*)+0x193c) [0x5d162c]
> 12: (OSD::_dispatch(Message*)+0x2eb) [0x5d34fb]
> 13: (OSD::ms_dispatch(Message*)+0x129) [0x5d3a59]
> 14: (SimpleMessenger::dispatch_entry()+0x78b) [0x67513b]
> 15: (SimpleMessenger::DispatchThread::entry()+0xd) [0x52124d]
> 16: (()+0x7e9a) [0x7f6676226e9a]
> 17: (clone()+0x6d) [0x7f66747db4bd]
>
> --
> Tomasz Paszkowski
> SS7, Asterisk, SAN, Datacenter, Cloud Computing
> +48500166299

--
Tomasz Paszkowski
SS7, Asterisk, SAN, Datacenter, Cloud Computing
+48500166299
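PS: in case it helps with triage, the backtrace entries are already demangled symbol+offset pairs, but the raw addresses in brackets can be resolved to file:line with binutils if matching debug symbols are installed. A hypothetical invocation against the top frame (assuming a non-PIE ceph-osd binary, so runtime addresses map directly to link-time ones):

  addr2line -e /usr/bin/ceph-osd -f -C 0x5cfb51

which should land inside OSD::advance_map() and narrow down which erase() call is the one that trips.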