osd crash with object store set to newstore

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Sage and all,

I build ceph code from wip-newstore on RHEL7 and running performance
tests to compare with filestore. After few hours of running the tests
the osd daemons started to crash. Here is the stack trace, the osd
crashes immediately after the restart. So I could not get the osd up
and running.

ceph version b8e22893f44979613738dfcdd40dada2b513118
(eb8e22893f44979613738dfcdd40dada2b513118)
1: /usr/bin/ceph-osd() [0xb84652]
2: (()+0xf130) [0x7f915f84f130]
3: (gsignal()+0x39) [0x7f915e2695c9]
4: (abort()+0x148) [0x7f915e26acd8]
5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f915eb6d9d5]
6: (()+0x5e946) [0x7f915eb6b946]
7: (()+0x5e973) [0x7f915eb6b973]
8: (()+0x5eb9f) [0x7f915eb6bb9f]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x27a) [0xc84c5a]
10: (NewStore::collection_list_partial(coll_t, ghobject_t, int, int,
snapid_t, std::vector<ghobject_t, std::allocator<ghobject_t> >*,
ghobject_t*)+0x13c9) [0xa08639]
11: (PGBackend::objects_list_partial(hobject_t const&, int, int,
snapid_t, std::vector<hobject_t, std::allocator<hobject_t> >*,
hobject_t*)+0x352) [0x918a02]
12: (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+0x1066) [0x8aa906]
13: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>&)+0x1eb) [0x8cd06b]
14: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>&,
ThreadPool::TPHandle&)+0x68a) [0x85dbea]
15: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3ed)
[0x6c3f5d]
16: (OSD::ShardedOpWQ::_process(unsigned int,
ceph::heartbeat_handle_d*)+0x2e9) [0x6c4449]
17: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x86f) [0xc746bf]
18: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xc767f0]
19: (()+0x7df3) [0x7f915f847df3]
20: (clone()+0x6d) [0x7f915e32a01d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.

Please let me know the cause of this crash, when this crash happens I
noticed that two osds on separate machines are down. I can bring one
osd up but restarting the other osd causes both OSDs to crash. My
understanding is the crash seems to happen when two OSDs try to
communicate and replicate a particular PG.

Regards
Srikanth
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [CEPH Users]     [Ceph Large]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux