Running Ceph 12.2.2 on CentOS 7.4. The cluster was healthy until a command caused all of the monitors to crash. I applied a private build that fixes the issue (thanks!): https://tracker.ceph.com/issues/22847

The monitors have all started and all of the OSDs are reported as up in "ceph -s", but the OSDs themselves report their state as "booting", so none of the PGs have recovered. (Please see the attached OSD debug log; it seems to be looping through STATE_ACCEPTING_WAIT_BANNER_ADDR.)

ceph -s

  cluster:
    id:     021a1428-fea5-4697-bcd0-a45c1c2ca80b
    health: HEALTH_WARN
            Reduced data availability: 10240 pgs inactive, 3 pgs down, 4195 pgs peering

  services:
    mon: 5 daemons, quorum dl1-kaf101,dl1-kaf201,dl1-kaf301,dl1-kaf302,dl1-kaf401
    mgr: dl1-kaf101(active)
    osd: 64 osds: 64 up, 64 in; 100 remapped pgs

  data:
    pools:   3 pools, 10240 pgs
    objects: 94810 objects, 366 GB
    usage:   2376 GB used, 515 TB / 518 TB avail
    pgs:     59.004% pgs unknown
             40.996% pgs not active
             6042 unknown
             4195 peering
             3    down

OSD output:

  ceph --admin-daemon /var/run/ceph/dl1approd-osd.3.asok status
  {
      "cluster_fsid": "021a1428-fea5-4697-bcd0-a45c1c2ca80b",
      "osd_fsid": "63d816a2-beb3-4b94-8f34-62fa1ffc32ce",
      "whoami": 3,
      "state": "booting",
      "oldest_map": 133275,
      "newest_map": 133997,
      "num_pgs": 439
  }

Config:

  [global]
  debug ms = 5/5
  debug heartbeatmap = 5/5
  mon osd down out interval = 30
  mon osd min down reports = 2
  osd heartbeat grace = 35
  osd mon heartbeat interval = 20
  osd mon report interval max = 30
  osd mon ack timeout = 15
  fsid = 021a1428-fea5-4697-bcd0-a45c1c2ca80b
  auth cluster required = cephx
  auth service required = cephx
  auth client required = cephx
  mon osd allow primary affinity = true
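In case it helps, this is roughly how the per-OSD state can be checked across a host (a minimal sketch; it assumes the admin sockets follow the dl1approd-osd.N.asok naming shown above and that jq is available to pull out the "state" field):

  for sock in /var/run/ceph/dl1approd-osd.*.asok; do
      # query each local OSD daemon over its admin socket and keep only "state"
      state=$(ceph --admin-daemon "$sock" status | jq -r '.state')
      echo "$sock: $state"
  done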
--
Efficiency is Intelligent Laziness

Attachment: osd.log