Re: almost all of the Ceph cluster OSD are assert when create pg for luminous(12.0.3)

On Tue, Jun 20, 2017 at 2:17 AM, Wangwenfeng <wang.wenfeng@xxxxxxx> wrote:
> Hi, all
> I am testing Luminous (12.0.3), and almost all of the OSDs in the Ceph cluster hit an assert when creating PGs. The assert is:
>
> 2017-06-19 19:52:43.030110 7f4d0a589700 10 osd.1 622 build_initial_pg_history 3.1f3 created 377
> 2017-06-19 19:52:43.063891 7f4d0a589700 -1 /root/mnt/ceph_tmp/release/Ubuntu/WORKDIR/ceph-12.0.3/src/osd/OSDMap.h: In function 'const epoch_t& OSDMap::get_up_from(int) const' thread 7f4d0a589700 time 2017-06-19 19:52:43.060805
> /root/mnt/ceph_tmp/release/Ubuntu/WORKDIR/ceph-12.0.3/src/osd/OSDMap.h: 556: FAILED assert(exists(osd))
>
>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x10e) [0x55f0c9293d3e]
>  2: (()+0x3c9cb0) [0x55f0c8c99cb0]
>  3: (PastIntervals::check_new_interval(int, int, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, int, int, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, unsigned int, unsigned int, std::shared_ptr<OSDMap const>, std::shared_ptr<OSDMap const>, pg_t, IsPGRecoverablePredicate*, PastIntervals*, std::ostream*)+0x5cd) [0x55f0c8f8826d]
>  4: (OSD::build_initial_pg_history(spg_t, unsigned int, utime_t, pg_history_t*, PastIntervals*)+0x594) [0x55f0c8d84ce4]
>  5: (OSD::handle_pg_create(boost::intrusive_ptr<OpRequest>)+0x97c) [0x55f0c8d8f5cc]
>  6: (OSD::dispatch_op(boost::intrusive_ptr<OpRequest>)+0x1b1) [0x55f0c8d91641]
>  7: (OSD::do_waiters()+0x9d) [0x55f0c8d917cd]
>  8: (OSD::ms_dispatch(Message*)+0x69) [0x55f0c8d923f9]
>  9: (DispatchQueue::entry()+0x79b) [0x55f0c945faeb]
>  10: (DispatchQueue::DispatchThread::entry()+0xd) [0x55f0c930cafd]
>  11: (()+0x8182) [0x7f4d16bb0182]
>  12: (clone()+0x6d) [0x7f4d15ca047d]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>
> Has anyone else seen this problem? Has the problem been solved?

Nobody else has reported this and it doesn't seem to have turned up in
our nightlies. Can you describe a little more about what happens? If
you got a core dump, can you find the value of old_acting_primary?
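
To illustrate why that value matters, here is a minimal toy sketch in Python. It is not Ceph code: ToyOSDMap, lookup_fails, and the epoch numbers are made up for illustration. It only mirrors the check shown in the log, assert(exists(osd)) in OSDMap::get_up_from(int), and assumes the id being looked up is the old acting primary. Under that assumption, an id that is not in the map (for example -1, or an OSD that was removed) would trip the assert, while a known id would not.

```python
class ToyOSDMap:
    """Toy stand-in for an OSD map; hypothetical, not the Ceph class."""

    def __init__(self, up_from_by_osd):
        # Maps OSD id -> epoch at which that OSD last came up (made-up values).
        self.up_from_by_osd = dict(up_from_by_osd)

    def exists(self, osd):
        return osd in self.up_from_by_osd

    def get_up_from(self, osd):
        # Mirrors the failing check from the log: assert(exists(osd)).
        if not self.exists(osd):
            raise AssertionError(f"FAILED assert(exists(osd)) for osd={osd}")
        return self.up_from_by_osd[osd]


def lookup_fails(osdmap, osd):
    """Return True if looking up this OSD id would trip the assert."""
    try:
        osdmap.get_up_from(osd)
    except AssertionError:
        return True
    return False


# Hypothetical previous map that knows only osd.0 and osd.1.
lastmap = ToyOSDMap({0: 370, 1: 375})

for candidate in (1, -1, 7):
    print(candidate, "asserts" if lookup_fails(lastmap, candidate) else "ok")
```

If the value in the core dump turns out to be -1 or an id that the older map does not contain, that would point at how the primary is chosen for that interval rather than at the map lookup itself.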
-Greg