Hello,

Sorry to disturb you. I am running Ceph 12.2.8, and recently I found that the
leader monitor always fails in thread_name:safe_timer. Here is part of the log:

     0> 2018-11-20 10:33:22.386543 7faf7d84f700 -1 *** Caught signal (Aborted) **
 in thread 7faf7d84f700 thread_name:safe_timer

 ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)
 1: (()+0x93f2d1) [0x55ef7319c2d1]
 2: (()+0xf5e0) [0x7faf83fb55e0]
 3: (gsignal()+0x37) [0x7faf810ee1f7]
 4: (abort()+0x148) [0x7faf810ef8e8]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7faf819f4ac5]
 6: (()+0x5ea36) [0x7faf819f2a36]
 7: (()+0x5ea63) [0x7faf819f2a63]
 8: (()+0x5ec83) [0x7faf819f2c83]
 9: (std::__throw_out_of_range(char const*)+0x77) [0x7faf81a47a97]
 10: (FSMap::get_info_gid(mds_gid_t) const+0xfc) [0x55ef72e1dc0c]
 11: (MDSMonitor::tick()+0x427) [0x55ef72e107d7]
 12: (Monitor::tick()+0x128) [0x55ef72c48908]
 13: (C_MonContext::finish(int)+0x37) [0x55ef72c1a7d7]
 14: (Context::complete(int)+0x9) [0x55ef72c585c9]
 15: (SafeTimer::timer_thread()+0x104) [0x55ef72e8dbc4]
 16: (SafeTimerThread::entry()+0xd) [0x55ef72e8f5ed]
 17: (()+0x7e25) [0x7faf83fade25]
 18: (clone()+0x6d) [0x7faf811b134d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

My cluster's status is as follows:

  cluster:
    id:     8c9bc910-c7f1-4b98-8c61-e18ee786e983
    health: HEALTH_OK

  services:
    mon: 2 daemons, quorum qbs-monitor-online010-hbaz1.qiyi.virtual,qbs-monitor-online009-hbaz1.qiyi.virtual
    mgr: qbs-monitor-online009-hbaz1(active, starting)
    osd: 164 osds: 164 up, 164 in
    rgw: 3 daemons active

  data:
    pools:   26 pools, 4832 pgs
    objects: 5.39k objects, 20.0GiB
    usage:   243GiB used, 1.07PiB / 1.07PiB avail
    pgs:     4832 active+clean

  io:
    client: 4.63KiB/s wr, 0op/s rd, 0op/s wr

What can I do to recover it? I am happy to provide more information about
the problem if necessary.

Sincerely,
LouKaiyi