ceph monitor keeps crashing

Hello,

I have a Ceph cluster that has been running for over 2 years, and the monitors began crashing yesterday. I have occasionally had OSDs flapping up and down, and sometimes I need to rebuild an OSD. I found 3 OSDs down yesterday; they may or may not be the cause of this issue.

Ceph version: 12.2.12 (upgrading from 12.2.8 did not fix the issue)
I have 5 mon nodes. When I start the mon service on the first 2 nodes, they are fine. Once I start the service on the third node, all 3 nodes start flapping up and down because of an Aborted signal in OSDMonitor::build_incremental. I also tried to recover from a single monitor (removing the other 4 nodes) by injecting a monmap, but that node keeps crashing as well; the rough procedure I used for that attempt is below.
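
For reference, the single-mon recovery attempt followed the standard extract/edit/inject monmap procedure, roughly as follows (run on ctlr101 with its mon stopped; /tmp/monmap is just an example path):

systemctl stop ceph-mon@ctlr101
ceph-mon -i ctlr101 --extract-monmap /tmp/monmap
monmaptool /tmp/monmap --rm ctlr201
monmaptool /tmp/monmap --rm ctlr301
monmaptool /tmp/monmap --rm ceph101
monmaptool /tmp/monmap --rm ceph201
ceph-mon -i ctlr101 --inject-monmap /tmp/monmap
systemctl start ceph-mon@ctlr101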

The crash log from the mon is below:
May 31 02:26:09 ctlr101 systemd[1]: Started Ceph cluster monitor daemon.
May 31 02:26:09 ctlr101 ceph-mon[2632098]: 2019-05-31 02:26:09.345533 7fe250321080 -1 compacting monitor store ...
May 31 02:26:11 ctlr101 ceph-mon[2632098]: 2019-05-31 02:26:11.320926 7fe250321080 -1 done compacting
May 31 02:26:16 ctlr101 ceph-mon[2632098]: 2019-05-31 02:26:16.497933 7fe242925700 -1 log_channel(cluster) log [ERR] : overall HEALTH_ERR 13 osds down; 1 host (6 osds) down; 74266/2566020 objects misplace
May 31 02:26:16 ctlr101 ceph-mon[2632098]: *** Caught signal (Aborted) **
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  in thread 7fe24692d700 thread_name:ms_dispatch
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  1: (()+0x9e6334) [0x558c5f2fb334]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  2: (()+0x11390) [0x7fe24f6ce390]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  3: (gsignal()+0x38) [0x7fe24dc14428]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  4: (abort()+0x16a) [0x7fe24dc1602a]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  5: (OSDMonitor::build_incremental(unsigned int, unsigned int, unsigned long)+0x9c5) [0x558c5ee80455]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  6: (OSDMonitor::send_incremental(unsigned int, MonSession*, bool, boost::intrusive_ptr<MonOpRequest>)+0xcf) [0x558c5ee80b3f]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  7: (OSDMonitor::check_osdmap_sub(Subscription*)+0x22d) [0x558c5ee8622d]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  8: (Monitor::handle_subscribe(boost::intrusive_ptr<MonOpRequest>)+0x1082) [0x558c5ecdb0b2]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  9: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0x9f4) [0x558c5ed05114]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  10: (Monitor::_ms_dispatch(Message*)+0x6db) [0x558c5ed061ab]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  11: (Monitor::ms_dispatch(Message*)+0x23) [0x558c5ed372c3]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  12: (DispatchQueue::entry()+0xf4a) [0x558c5f2a205a]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  13: (DispatchQueue::DispatchThread::entry()+0xd) [0x558c5f035dcd]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  14: (()+0x76ba) [0x7fe24f6c46ba]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  15: (clone()+0x6d) [0x7fe24dce641d]
May 31 02:26:16 ctlr101 ceph-mon[2632098]: 2019-05-31 02:26:16.578932 7fe24692d700 -1 *** Caught signal (Aborted) **
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  in thread 7fe24692d700 thread_name:ms_dispatch
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  1: (()+0x9e6334) [0x558c5f2fb334]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  2: (()+0x11390) [0x7fe24f6ce390]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  3: (gsignal()+0x38) [0x7fe24dc14428]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  4: (abort()+0x16a) [0x7fe24dc1602a]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  5: (OSDMonitor::build_incremental(unsigned int, unsigned int, unsigned long)+0x9c5) [0x558c5ee80455]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  6: (OSDMonitor::send_incremental(unsigned int, MonSession*, bool, boost::intrusive_ptr<MonOpRequest>)+0xcf) [0x558c5ee80b3f]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  7: (OSDMonitor::check_osdmap_sub(Subscription*)+0x22d) [0x558c5ee8622d]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  8: (Monitor::handle_subscribe(boost::intrusive_ptr<MonOpRequest>)+0x1082) [0x558c5ecdb0b2]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  9: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0x9f4) [0x558c5ed05114]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  10: (Monitor::_ms_dispatch(Message*)+0x6db) [0x558c5ed061ab]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  11: (Monitor::ms_dispatch(Message*)+0x23) [0x558c5ed372c3]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  12: (DispatchQueue::entry()+0xf4a) [0x558c5f2a205a]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  13: (DispatchQueue::DispatchThread::entry()+0xd) [0x558c5f035dcd]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  14: (()+0x76ba) [0x7fe24f6c46ba]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  15: (clone()+0x6d) [0x7fe24dce641d]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  -1501> 2019-05-31 02:26:09.345533 7fe250321080 -1 compacting monitor store ...
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  -1475> 2019-05-31 02:26:11.320926 7fe250321080 -1 done compacting
May 31 02:26:16 ctlr101 ceph-mon[2632098]:   -946> 2019-05-31 02:26:16.497933 7fe242925700 -1 log_channel(cluster) log [ERR] : overall HEALTH_ERR 13 osds down; 1 host (6 osds) down; 74266/2566020 objects
May 31 02:26:16 ctlr101 ceph-mon[2632098]:      0> 2019-05-31 02:26:16.578932 7fe24692d700 -1 *** Caught signal (Aborted) **
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  in thread 7fe24692d700 thread_name:ms_dispatch
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  1: (()+0x9e6334) [0x558c5f2fb334]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  2: (()+0x11390) [0x7fe24f6ce390]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  3: (gsignal()+0x38) [0x7fe24dc14428]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  4: (abort()+0x16a) [0x7fe24dc1602a]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  5: (OSDMonitor::build_incremental(unsigned int, unsigned int, unsigned long)+0x9c5) [0x558c5ee80455]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  6: (OSDMonitor::send_incremental(unsigned int, MonSession*, bool, boost::intrusive_ptr<MonOpRequest>)+0xcf) [0x558c5ee80b3f]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  7: (OSDMonitor::check_osdmap_sub(Subscription*)+0x22d) [0x558c5ee8622d]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  8: (Monitor::handle_subscribe(boost::intrusive_ptr<MonOpRequest>)+0x1082) [0x558c5ecdb0b2]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  9: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0x9f4) [0x558c5ed05114]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  10: (Monitor::_ms_dispatch(Message*)+0x6db) [0x558c5ed061ab]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  11: (Monitor::ms_dispatch(Message*)+0x23) [0x558c5ed372c3]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  12: (DispatchQueue::entry()+0xf4a) [0x558c5f2a205a]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  13: (DispatchQueue::DispatchThread::entry()+0xd) [0x558c5f035dcd]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  14: (()+0x76ba) [0x7fe24f6c46ba]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  15: (clone()+0x6d) [0x7fe24dce641d]
May 31 02:26:16 ctlr101 ceph-mon[2632098]:  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
May 31 02:26:16 ctlr101 systemd[1]: ceph-mon@ctlr101.service: Main process exited, code=killed, status=6/ABRT
May 31 02:26:16 ctlr101 systemd[1]: ceph-mon@ctlr101.service: Unit entered failed state.
May 31 02:26:16 ctlr101 systemd[1]: ceph-mon@ctlr101.service: Failed with result 'signal'.
May 31 02:26:26 ctlr101 systemd[1]: ceph-mon@ctlr101.service: Service hold-off time over, scheduling restart.
May 31 02:26:26 ctlr101 systemd[1]: Stopped Ceph cluster monitor daemon.
May 31 02:26:26 ctlr101 systemd[1]: Started Ceph cluster monitor daemon.

The command ceph -s times out most of the time. Sometimes, when 3 or more mon services are up, I can get a result, but the mon services go down again very quickly.
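
When ceph -s times out I can still query an individual monitor directly through its admin socket on the mon host, for example:

ceph daemon mon.ctlr101 mon_status
ceph daemon mon.ctlr101 quorum_status

Below is the ceph -s output from one of the moments when it did respond: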

root@ctlr101:~# ceph -s
  cluster:
    id:     53264466-680b-42e6-899d-d042c3a8334a
    health: HEALTH_ERR
            6 osds down
            1 host (6 osds) down
            74266/2566020 objects misplaced (2.894%)
            Reduced data availability: 446 pgs inactive, 440 pgs peering
            Degraded data redundancy: 108173/2566020 objects degraded (4.216%), 142 pgs degraded, 330 pgs undersized
            18600 slow requests are blocked > 32 sec. Implicated osds 8,21,27,29,32,41,63,91,96,98,100
            27371 stuck requests are blocked > 4096 sec. Implicated osds 14,25,26,34,37,46,48,50,51,58,59,60,61,66,67,69,73,74,75,90,95,99
            2/5 mons down, quorum ctlr101,ctlr201,ctlr301

  services:
    mon: 5 daemons, quorum ctlr101,ctlr201,ctlr301, out of quorum: ceph101, ceph201
    mgr: ceph101(active), standbys: ceph301, ctlr201, ctlr301, ceph201, ctlr101
    mds: cephfs-1/1/1 up  {0=ceph101=up:active}, 2 up:standby
    osd: 52 osds: 46 up, 52 in; 22 remapped pgs
    rgw: 3 daemons active

  data:
    pools:   20 pools, 2528 pgs
    objects: 855.34k objects, 3.69TiB
    usage:   11.4TiB used, 28.3TiB / 39.7TiB avail
    pgs:     0.237% pgs unknown
             17.445% pgs not active
             108173/2566020 objects degraded (4.216%)
             74266/2566020 objects misplaced (2.894%)
             1667 active+clean
             413  peering
             198  active+undersized
             141  active+undersized+degraded
             60   active+remapped+backfill_wait
             27   remapped+peering
             12   active+clean+remapped
             6    unknown
             2    active+undersized+remapped
             1    active+undersized+degraded+remapped+backfilling
             1    remapped

  io:
    client:   5.65MiB/s rd, 81.1KiB/s wr, 143op/s rd, 43op/s wr

Note that the io data above is stale; the values have not changed for a day. If anyone can give me some hints on how to keep the mon service running, that would be great. Thanks in advance.
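
If more detailed logs would help, I can raise the monitor debug levels in ceph.conf on the mon hosts and capture the log around the next crash, for example:

[mon]
    debug mon = 20
    debug ms = 1
    debug paxos = 10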

Best Regards,
Li JianYu

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
