On Thu, Feb 20, 2014 at 4:26 AM, Michael <michael@xxxxxxxxxxxxxxxxxx> wrote:
> Hi All,
>
> Have a log full of -
>
> "log [ERR] : 1.9 log bound mismatch, info (46784'1236417,46797'1239418]
> actual [46784'1235968,46797'1239418]"

Do you mean that error message is showing up for a lot of different PGs?
The specific error indicates that the PG log doesn't look quite as
expected; in this case it has more entries than it should, which should be
recoverable. If that's the case for a lot of PGs, though, it sounds like
there may have been an issue with LevelDB that resurrected a lot of deleted
data and left the store in an inconsistent state. The particular assert
you're hitting supports that: an iterator is becoming invalid when it
shouldn't be.

If the other OSDs are fine, I'd mark this OSD down and out, reformat the
drive, and let the cluster recover (a rough sketch of the commands is at
the bottom of this mail).
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

>
> "192.168.7.177:6800/15655 >> 192.168.7.183:6802/3348 pipe(0x20e4f00 sd=65
> :56394 s=2 pgs=24194 cs=1 l=0 c=0x19668f20).fault, initiating reconnect"
>
> and an OSD that showed as down; started it up and data synced as expected,
> but then the OSD started crashing and restarting in a cycle.
>
> Log can be obtained from http://onlinefusion.co.uk/info/ceph-osd.4.zip
> (Trimmed the repeating parts so it's 160KB), snippet below.
> Any ideas what's wrong with it?
>
> --------------
>
>     -1> 2014-02-20 11:54:55.703196 7fbca1278700  0 log [ERR] : 1.9 log bound
> mismatch, info (46784'1236417,46797'1239418] actual
> [46784'1235968,46797'1239418]
>      0> 2014-02-20 11:55:05.243723 7fbc9f274700 -1 os/DBObjectMap.cc: In
> function 'virtual bool DBObjectMap::DBObjectMapIteratorImpl::valid()' thread
> 7fbc9f274700 time 2014-02-20 11:55:05.240689
> os/DBObjectMap.cc: 400: FAILED assert(!valid || cur_iter->valid())
>
>  ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60)
>  1: /usr/bin/ceph-osd() [0x95b762]
>  2: (PG::_scan_list(ScrubMap&, std::vector<hobject_t,
> std::allocator<hobject_t> >&, bool, ThreadPool::TPHandle&)+0xed7) [0x86e3b7]
>  3: (PG::build_scrub_map_chunk(ScrubMap&, hobject_t, hobject_t, bool,
> ThreadPool::TPHandle&)+0x106) [0x871256]
>  4: (PG::replica_scrub(MOSDRepScrub*, ThreadPool::TPHandle&)+0x88e)
> [0x871f1e]
>  5: (OSD::RepScrubWQ::_process(MOSDRepScrub*, ThreadPool::TPHandle&)+0xbd)
> [0x740b8d]
>  6: (ThreadPool::worker(ThreadPool::WorkThread*)+0x68c) [0xa33d9c]
>  7: (ThreadPool::WorkThread::entry()+0x10) [0xa34ff0]
>  8: (()+0x7f8e) [0x7fbcbe3c6f8e]
>  9: (clone()+0x6d) [0x7fbcbc8e5a0d]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
> interpret this.
>
> --- logging levels ---
>    0/ 5 none
>    0/ 0 lockdep
>    0/ 0 context
>    0/ 0 crush
>    1/ 5 mds
>    1/ 5 mds_balancer
>    1/ 5 mds_locker
>    1/ 5 mds_log
>    1/ 5 mds_log_expire
>    1/ 5 mds_migrator
>    0/ 0 buffer
>    0/ 0 timer
>    0/ 1 filer
>    0/ 1 striper
>    0/ 1 objecter
>    0/ 5 rados
>    0/ 5 rbd
>    0/ 0 journaler
>    0/ 5 objectcacher
>    0/ 5 client
>    0/ 0 osd
>    0/ 0 optracker
>    0/ 0 objclass
>    0/ 0 filestore
>    0/ 0 journal
>    0/ 0 ms
>    1/ 5 mon
>    0/ 0 monc
>    1/ 5 paxos
>    0/ 0 tp
>    0/ 0 auth
>    1/ 5 crypto
>    0/ 0 finisher
>    0/ 0 heartbeatmap
>    0/ 0 perfcounter
>    1/ 5 rgw
>    1/ 5 javaclient
>    0/ 0 asok
>    0/ 0 throttle
>   -2/-2 (syslog threshold)
>   -1/-1 (stderr threshold)
>   max_recent 10000
>   max_new 1000
>   log_file /var/log/ceph/ceph-osd.4.log
> --- end dump of recent events ---
> 2014-02-20 11:55:05.343940 7fbc9f274700 -1 *** Caught signal (Aborted) **
>  in thread 7fbc9f274700
>
>  ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60)
>  1: /usr/bin/ceph-osd() [0x97dc70]
>  2: (()+0xfbd0) [0x7fbcbe3cebd0]
>  3: (gsignal()+0x37) [0x7fbcbc822037]
>  4: (abort()+0x148) [0x7fbcbc825698]
>  5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7fbcbd12fe8d]
>  6: (()+0x5ef76) [0x7fbcbd12df76]
>  7: (()+0x5efa3) [0x7fbcbd12dfa3]
>  8: (()+0x5f1de) [0x7fbcbd12e1de]
>  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x43d) [0xa3f64d]
>  10: /usr/bin/ceph-osd() [0x95b762]
>  11: (PG::_scan_list(ScrubMap&, std::vector<hobject_t,
> std::allocator<hobject_t> >&, bool, ThreadPool::TPHandle&)+0xed7) [0x86e3b7]
>  12: (PG::build_scrub_map_chunk(ScrubMap&, hobject_t, hobject_t, bool,
> ThreadPool::TPHandle&)+0x106) [0x871256]
>  13: (PG::replica_scrub(MOSDRepScrub*, ThreadPool::TPHandle&)+0x88e)
> [0x871f1e]
>  14: (OSD::RepScrubWQ::_process(MOSDRepScrub*, ThreadPool::TPHandle&)+0xbd)
> [0x740b8d]
>  15: (ThreadPool::worker(ThreadPool::WorkThread*)+0x68c) [0xa33d9c]
>  16: (ThreadPool::WorkThread::entry()+0x10) [0xa34ff0]
>  17: (()+0x7f8e) [0x7fbcbe3c6f8e]
>  18: (clone()+0x6d) [0x7fbcbc8e5a0d]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
> interpret this.
>
> --- begin dump of recent events ---
>      0> 2014-02-20 11:55:05.343940 7fbc9f274700 -1 *** Caught signal
> (Aborted) **
>  in thread 7fbc9f274700
>
>  ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60)
>  1: /usr/bin/ceph-osd() [0x97dc70]
>  2: (()+0xfbd0) [0x7fbcbe3cebd0]
>  3: (gsignal()+0x37) [0x7fbcbc822037]
>  4: (abort()+0x148) [0x7fbcbc825698]
>  5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7fbcbd12fe8d]
>  6: (()+0x5ef76) [0x7fbcbd12df76]
>  7: (()+0x5efa3) [0x7fbcbd12dfa3]
>  8: (()+0x5f1de) [0x7fbcbd12e1de]
>  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x43d) [0xa3f64d]
>  10: /usr/bin/ceph-osd() [0x95b762]
>  11: (PG::_scan_list(ScrubMap&, std::vector<hobject_t,
> std::allocator<hobject_t> >&, bool, ThreadPool::TPHandle&)+0xed7) [0x86e3b7]
>  12: (PG::build_scrub_map_chunk(ScrubMap&, hobject_t, hobject_t, bool,
> ThreadPool::TPHandle&)+0x106) [0x871256]
>  13: (PG::replica_scrub(MOSDRepScrub*, ThreadPool::TPHandle&)+0x88e)
> [0x871f1e]
>  14: (OSD::RepScrubWQ::_process(MOSDRepScrub*, ThreadPool::TPHandle&)+0xbd)
> [0x740b8d]
>  15: (ThreadPool::worker(ThreadPool::WorkThread*)+0x68c) [0xa33d9c]
>  16: (ThreadPool::WorkThread::entry()+0x10) [0xa34ff0]
>  17: (()+0x7f8e) [0x7fbcbe3c6f8e]
>  18: (clone()+0x6d) [0x7fbcbc8e5a0d]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
> interpret this.
>
> --------------
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
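
To make that concrete, here is a rough sketch of the commands for retiring
and re-provisioning a single bad OSD, assuming the affected daemon is osd.4
(the ID is taken from the log file name; the init-system call, host name,
and disk path below are assumptions you'll need to adapt to your deployment):

  # watch recovery progress in another terminal
  ceph -w

  # take the OSD out of the data distribution and stop the daemon
  # (osd.4 assumed from ceph-osd.4.log)
  ceph osd out 4
  service ceph stop osd.4    # or "stop ceph-osd id=4" on Upstart systems

  # once "ceph health" reports HEALTH_OK again, remove the OSD entirely
  ceph osd crush remove osd.4
  ceph auth del osd.4
  ceph osd rm 4

  # wipe the disk and create a fresh OSD in its place, e.g. with ceph-deploy
  # (<host> and <disk> are placeholders for your node and device)
  ceph-deploy disk zap <host>:<disk>
  ceph-deploy osd create <host>:<disk>

Marking the OSD out first lets the cluster re-replicate its PGs onto the
other OSDs; waiting for HEALTH_OK before removing and reformatting means
you never wipe data the cluster still needs, provided the other OSDs are
healthy.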