Seems to be a truncated log file... That usually indicates filesystem
corruption. Anything in dmesg?
-Sam

On Thu, Nov 15, 2012 at 1:07 PM, Stefan Priebe <s.priebe@xxxxxxxxxxxx> wrote:
> Hello list,
>
> actual master incl. upstream/wip-fd-simple-cache results in this crash when
> i try to start some of my osds (others work fine) today on multiple nodes:
>
> -2> 2012-11-15 22:04:09.226945 7f3af1c7a780 0 osd.52 pg_epoch: 657
> pg[3.3b( v 632'823 (632'823,632'823] n=5 ec=17 les/c 18/18 656/656/17) []
> r=0 lpr=0 pi=17-655/2 (info mismatch, log(632'823,0'0]) (log bound mismatch,
> empty) lcod 0'0 mlcod 0'0 inactive] Got exception 'read_log_error: read_log
> got 0 bytes, expected 126086-0=126086' while reading log. Moving corrupted
> log file to 'corrupt_log_2012-11-15_22:04_3.3b' for later analysis.
> -1> 2012-11-15 22:04:09.233563 7f3af1c7a780 0 osd.52 pg_epoch: 657
> pg[3.557( v 632'753 (0'0,632'753] n=2 ec=17 les/c 18/18 656/656/17) [] r=0
> lpr=0 pi=17-655/2 (info mismatch, log(0'0,0'0]) lcod 0'0 mlcod 0'0 inactive]
> Got exception 'read_log_error: read_log got 0 bytes, expected
> 115488-0=115488' while reading log. Moving corrupted log file to
> 'corrupt_log_2012-11-15_22:04_3.557' for later analysis.
> 0> 2012-11-15 22:04:09.234536 7f3ae87d0700 -1 os/FileStore.cc: In
> function 'int FileStore::_collection_add(coll_t, coll_t, const hobject_t&,
> const SequencerPosition&)' thread 7f3ae87d0700 time 2012-11-15
> 22:04:09.233672
> os/FileStore.cc: 4500: FAILED assert(replaying)
>
> ceph version 0.54-607-gf89e101 (f89e1012bafabd6875a4a1e1832d76ffdf45b039)
> 1: (FileStore::_collection_add(coll_t, coll_t, hobject_t const&,
> SequencerPosition const&)+0x77d) [0x72ff0d]
> 2: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long,
> int)+0x25fb) [0x73481b]
> 3: (FileStore::do_transactions(std::list<ObjectStore::Transaction*,
> std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x4c)
> [0x73952c]
> 4: (FileStore::_do_op(FileStore::OpSequencer*)+0x195) [0x705c45]
> 5: (ThreadPool::worker(ThreadPool::WorkThread*)+0x82b) [0x830f1b]
> 6: (ThreadPool::WorkThread::entry()+0x10) [0x833700]
> 7: (()+0x68ca) [0x7f3af16578ca]
> 8: (clone()+0x6d) [0x7f3aefac6bfd]
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
> interpret this.
>
> --- logging levels ---
> 0/ 5 none
> 0/ 0 lockdep
> 0/ 0 context
> 0/ 0 crush
> 1/ 5 mds
> 1/ 5 mds_balancer
> 1/ 5 mds_locker
> 1/ 5 mds_log
> 1/ 5 mds_log_expire
> 1/ 5 mds_migrator
> 0/ 0 buffer
> 0/ 0 timer
> 0/ 1 filer
> 0/ 1 striper
> 0/ 1 objecter
> 0/ 5 rados
> 0/ 5 rbd
> 0/ 0 journaler
> 0/ 5 objectcacher
> 0/ 5 client
> 0/ 0 osd
> 0/ 0 optracker
> 0/ 0 objclass
> 0/ 0 filestore
> 0/ 0 journal
> 0/ 0 ms
> 1/ 5 mon
> 0/ 0 monc
> 0/ 5 paxos
> 0/ 0 tp
> 0/ 0 auth
> 1/ 5 crypto
> 0/ 0 finisher
> 0/ 0 heartbeatmap
> 0/ 0 perfcounter
> 1/ 5 rgw
> 1/ 5 hadoop
> 1/ 5 javaclient
> 0/ 0 asok
> 0/ 0 throttle
> -2/-2 (syslog threshold)
> -1/-1 (stderr threshold)
> max_recent 10000
> max_new 1000000
> log_file /var/log/ceph/ceph-osd.52.log
> --- end dump of recent events ---
> 2012-11-15 22:04:09.235734 7f3ae87d0700 -1 *** Caught signal (Aborted) **
> in thread 7f3ae87d0700
>
> ceph version 0.54-607-gf89e101 (f89e1012bafabd6875a4a1e1832d76ffdf45b039)
> 1: /usr/bin/ceph-osd() [0x799769]
> 2: (()+0xeff0) [0x7f3af165fff0]
> 3: (gsignal()+0x35) [0x7f3aefa29215]
> 4: (abort()+0x180) [0x7f3aefa2c020]
> 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f3af02bddc5]
> 6: (()+0xcb166) [0x7f3af02bc166]
> 7: (()+0xcb193) [0x7f3af02bc193]
> 8: (()+0xcb28e) [0x7f3af02bc28e]
> 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x7c9) [0x7fd069]
> 10: (FileStore::_collection_add(coll_t, coll_t, hobject_t const&,
> SequencerPosition const&)+0x77d) [0x72ff0d]
> 11: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long,
> int)+0x25fb) [0x73481b]
> 12: (FileStore::do_transactions(std::list<ObjectStore::Transaction*,
> std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x4c)
> [0x73952c]
> 13: (FileStore::_do_op(FileStore::OpSequencer*)+0x195) [0x705c45]
> 14: (ThreadPool::worker(ThreadPool::WorkThread*)+0x82b) [0x830f1b]
> 15: (ThreadPool::WorkThread::entry()+0x10) [0x833700]
> 16: (()+0x68ca) [0x7f3af16578ca]
> 17: (clone()+0x6d) [0x7f3aefac6bfd]
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
> interpret this.
>
> --- begin dump of recent events ---
> 0> 2012-11-15 22:04:09.235734 7f3ae87d0700 -1 *** Caught signal
> (Aborted) **
> in thread 7f3ae87d0700
>
> ceph version 0.54-607-gf89e101 (f89e1012bafabd6875a4a1e1832d76ffdf45b039)
> 1: /usr/bin/ceph-osd() [0x799769]
> 2: (()+0xeff0) [0x7f3af165fff0]
> 3: (gsignal()+0x35) [0x7f3aefa29215]
> 4: (abort()+0x180) [0x7f3aefa2c020]
> 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f3af02bddc5]
> 6: (()+0xcb166) [0x7f3af02bc166]
> 7: (()+0xcb193) [0x7f3af02bc193]
> 8: (()+0xcb28e) [0x7f3af02bc28e]
> 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x7c9) [0x7fd069]
> 10: (FileStore::_collection_add(coll_t, coll_t, hobject_t const&,
> SequencerPosition const&)+0x77d) [0x72ff0d]
> 11: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long,
> int)+0x25fb) [0x73481b]
> 12: (FileStore::do_transactions(std::list<ObjectStore::Transaction*,
> std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x4c)
> [0x73952c]
> 13: (FileStore::_do_op(FileStore::OpSequencer*)+0x195) [0x705c45]
> 14: (ThreadPool::worker(ThreadPool::WorkThread*)+0x82b) [0x830f1b]
> 15: (ThreadPool::WorkThread::entry()+0x10) [0x833700]
> 16: (()+0x68ca) [0x7f3af16578ca]
> 17: (clone()+0x6d) [0x7f3aefac6bfd]
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
> interpret this.
>
> --- logging levels ---
> 0/ 5 none
> 0/ 0 lockdep
> 0/ 0 context
> 0/ 0 crush
> 1/ 5 mds
> 1/ 5 mds_balancer
> 1/ 5 mds_locker
> 1/ 5 mds_log
> 1/ 5 mds_log_expire
> 1/ 5 mds_migrator
> 0/ 0 buffer
> 0/ 0 timer
> 0/ 1 filer
> 0/ 1 striper
> 0/ 1 objecter
> 0/ 5 rados
> 0/ 5 rbd
> 0/ 0 journaler
> 0/ 5 objectcacher
> 0/ 5 client
> 0/ 0 osd
> 0/ 0 optracker
> 0/ 0 objclass
> 0/ 0 filestore
> 0/ 0 journal
> 0/ 0 ms
> 1/ 5 mon
> 0/ 0 monc
> 0/ 5 paxos
> 0/ 0 tp
> 0/ 0 auth
> 1/ 5 crypto
> 0/ 0 finisher
> 0/ 0 heartbeatmap
> 0/ 0 perfcounter
> 1/ 5 rgw
> 1/ 5 hadoop
> 1/ 5 javaclient
> 0/ 0 asok
> 0/ 0 throttle
> -2/-2 (syslog threshold)
> -1/-1 (stderr threshold)
> max_recent 10000
> max_new 1000000
> log_file /var/log/ceph/ceph-osd.52.log
> --- end dump of recent events ---
>
> Stefan
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html