On 10/01/2018 4:24 PM, Sam Huracan wrote:
> Hi Mike,
>
> Could you show system log at moment osd down and up?

OK, so I have no idea how I missed this each time I looked, but the syslog does show a problem.

I've created the dump file mentioned in the log; it's 29M compressed, so I'll have to send it directly to anyone who wants it.

Mike

------

Jan 10 15:56:31 pve ceph-osd[2722]: 2018-01-10 15:56:31.338068 7efe5eac1700 -1 abort: Corruption: block checksum mismatch
Jan 10 15:56:31 pve ceph-osd[2722]: *** Caught signal (Aborted) **
Jan 10 15:56:31 pve ceph-osd[2722]: in thread 7efe5eac1700 thread_name:tp_osd_tp
Jan 10 15:56:31 pve ceph-osd[2722]: ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)
Jan 10 15:56:31 pve ceph-osd[2722]: 1: (()+0xa16664) [0x55a8b396b664]
Jan 10 15:56:31 pve ceph-osd[2722]: 2: (()+0x110c0) [0x7efe796b70c0]
Jan 10 15:56:31 pve ceph-osd[2722]: 3: (gsignal()+0xcf) [0x7efe7867efcf]
Jan 10 15:56:31 pve ceph-osd[2722]: 4: (abort()+0x16a) [0x7efe786803fa]
Jan 10 15:56:31 pve ceph-osd[2722]: 5: (RocksDBStore::get(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, char const*, unsigned long, ceph::buffer::list*)+0x29f) [0x55a8b38a995f]
Jan 10 15:56:31 pve ceph-osd[2722]: 6: (BlueStore::Collection::get_onode(ghobject_t const&, bool)+0x5ae) [0x55a8b382d2ae]
Jan 10 15:56:31 pve ceph-osd[2722]: 7: (BlueStore::getattr(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ghobject_t const&, char const*, ceph::buffer::ptr&)+0xf6) [0x55a8b382e326]
Jan 10 15:56:31 pve ceph-osd[2722]: 8: (PGBackend::objects_get_attr(hobject_t const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, ceph::buffer::list*)+0x106) [0x55a8b35bde26]
Jan 10 15:56:31 pve ceph-osd[2722]: 9: (PrimaryLogPG::get_snapset_context(hobject_t const&, bool, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::list, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::list> > > const*, bool)+0x3fb) [0x55a8b35081db]
Jan 10 15:56:31 pve ceph-osd[2722]: 10: (PrimaryLogPG::get_object_context(hobject_t const&, bool, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::list, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::list> > > const*)+0xc39) [0x55a8b352fec9]
Jan 10 15:56:31 pve ceph-osd[2722]: 11: (PrimaryLogPG::find_object_context(hobject_t const&, std::shared_ptr<ObjectContext>*, bool, bool, hobject_t*)+0x387) [0x55a8b3533687]
Jan 10 15:56:31 pve ceph-osd[2722]: 12: (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x2214) [0x55a8b3571694]
Jan 10 15:56:31 pve ceph-osd[2722]: 13: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0xec6) [0x55a8b352c436]
Jan 10 15:56:31 pve ceph-osd[2722]: 14: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3ab) [0x55a8b33a99eb]
Jan 10 15:56:31 pve ceph-osd[2722]: 15: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x5a) [0x55a8b3647eba]
Jan 10 15:56:31 pve ceph-osd[2722]: 16: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x103d) [0x55a8b33d0f4d]
Jan 10 15:56:31 pve ceph-osd[2722]: 17: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x8ef) [0x55a8b39b806f]
Jan 10 15:56:31 pve ceph-osd[2722]: 18: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55a8b39bb370]
Jan 10 15:56:31 pve ceph-osd[2722]: 19: (()+0x7494) [0x7efe796ad494]
Jan 10 15:56:31 pve ceph-osd[2722]: 20: (clone()+0x3f) [0x7efe78734aff]
Jan 10 15:56:31 pve ceph-osd[2722]: 2018-01-10 15:56:31.343532 7efe5eac1700 -1 *** Caught signal (Aborted) **
Jan 10 15:56:31 pve ceph-osd[2722]: in thread 7efe5eac1700 thread_name:tp_osd_tp
Jan 10 15:56:31 pve ceph-osd[2722]: ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)
Jan 10 15:56:31 pve ceph-osd[2722]: 1: (()+0xa16664) [0x55a8b396b664]
Jan 10 15:56:31 pve ceph-osd[2722]: 2: (()+0x110c0) [0x7efe796b70c0]
Jan 10 15:56:31 pve ceph-osd[2722]: 3: (gsignal()+0xcf) [0x7efe7867efcf]
Jan 10 15:56:31 pve ceph-osd[2722]: 4: (abort()+0x16a) [0x7efe786803fa]
Jan 10 15:56:31 pve ceph-osd[2722]: 5: (RocksDBStore::get(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, char const*, unsigned long, ceph::buffer::list*)+0x29f) [0x55a8b38a995f]
Jan 10 15:56:31 pve ceph-osd[2722]: 6: (BlueStore::Collection::get_onode(ghobject_t const&, bool)+0x5ae) [0x55a8b382d2ae]
Jan 10 15:56:31 pve ceph-osd[2722]: 7: (BlueStore::getattr(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ghobject_t const&, char const*, ceph::buffer::ptr&)+0xf6) [0x55a8b382e326]
Jan 10 15:56:31 pve ceph-osd[2722]: 8: (PGBackend::objects_get_attr(hobject_t const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, ceph::buffer::list*)+0x106) [0x55a8b35bde26]
Jan 10 15:56:31 pve ceph-osd[2722]: 9: (PrimaryLogPG::get_snapset_context(hobject_t const&, bool, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::list, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::list> > > const*, bool)+0x3fb) [0x55a8b35081db]
Jan 10 15:56:31 pve ceph-osd[2722]: 10: (PrimaryLogPG::get_object_context(hobject_t const&, bool, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::list, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::list> > > const*)+0xc39) [0x55a8b352fec9]
Jan 10 15:56:31 pve ceph-osd[2722]: 11: (PrimaryLogPG::find_object_context(hobject_t const&, std::shared_ptr<ObjectContext>*, bool, bool, hobject_t*)+0x387) [0x55a8b3533687]
Jan 10 15:56:31 pve ceph-osd[2722]: 12: (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x2214) [0x55a8b3571694]
Jan 10 15:56:31 pve ceph-osd[2722]: 13: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0xec6) [0x55a8b352c436]
Jan 10 15:56:31 pve ceph-osd[2722]: 14: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3ab) [0x55a8b33a99eb]
Jan 10 15:56:31 pve ceph-osd[2722]: 15: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x5a) [0x55a8b3647eba]
Jan 10 15:56:31 pve ceph-osd[2722]: 16: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x103d) [0x55a8b33d0f4d]
Jan 10 15:56:31 pve ceph-osd[2722]: 17: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x8ef) [0x55a8b39b806f]
Jan 10 15:56:31 pve ceph-osd[2722]: 18: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55a8b39bb370]
Jan 10 15:56:31 pve ceph-osd[2722]: 19: (()+0x7494) [0x7efe796ad494]
Jan 10 15:56:31 pve ceph-osd[2722]: 20: (clone()+0x3f) [0x7efe78734aff]
Jan 10 15:56:31 pve ceph-osd[2722]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Jan 10 15:56:31 pve systemd[1]: ceph-osd@12.service: Main process exited, code=killed, status=6/ABRT
Jan 10 15:56:31 pve systemd[1]: ceph-osd@12.service: Unit entered failed state.
Jan 10 15:56:31 pve systemd[1]: ceph-osd@12.service: Failed with result 'signal'.
Jan 10 15:56:31 pve kernel: [171262.263294] libceph: osd12 down
Jan 10 15:56:51 pve systemd[1]: ceph-osd@12.service: Service hold-off time over, scheduling restart.
Jan 10 15:56:51 pve systemd[1]: Stopped Ceph object storage daemon osd.12.
Jan 10 15:56:51 pve systemd[1]: Starting Ceph object storage daemon osd.12...
Jan 10 15:56:51 pve systemd[1]: Started Ceph object storage daemon osd.12.
Jan 10 15:56:51 pve ceph-osd[26121]: starting osd.12 at - osd_data /var/lib/ceph/osd/ceph-12 /var/lib/ceph/osd/ceph-12/journal

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com