All,
Had a fairly substantial network interruption that knocked out roughly
270 OSDs:
     health HEALTH_ERR
            [...]
            273/384 in osds are down
            noup,nodown,noout flag(s) set
     monmap e2: 3 mons at {cephmon-0=10.10.6.0:6789/0,cephmon-1=10.10.6.1:6789/0,cephmon-2=10.10.6.2:6789/0}
            election epoch 138, quorum 0,1,2 cephmon-0,cephmon-1,cephmon-2
        mgr no daemons active
     osdmap e37718: 384 osds: 111 up, 384 in; 16764 remapped pgs
            flags noup,nodown,noout,sortbitwise,require_jewel_osds,require_kraken_osds
We've had network interruptions before, and normally the OSDs come back
on their own, or after a service restart. This time, no such luck (I'm
guessing the scale was just too much). After a few hours of trying to
figure out why the OSD services were running on the hosts (according to
systemd) but were still marked 'down' in 'ceph osd tree', I found this
thread:
http://ceph-devel.vger.kernel.narkive.com/ftEN7TOU/70-osd-are-down-and-not-coming-up
which appears to describe the scenario exactly (high CPU usage, osdmaps
way out of sync, etc.)
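For anyone who hits the same thing, the mismatch looked roughly like
this (the OSD id is just an example, and the commands assume
systemd-managed ceph-osd units):

    # systemd reports the OSD service as running...
    systemctl is-active ceph-osd@12
    # ...but the cluster map still shows it (and ~270 others) down
    ceph osd tree | grep -c down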
I've taken the steps outlined there, set the flags shown above, and am
monitoring the OSDs' catch-up progress. The OSD farthest behind is
about 5,000 epochs out of sync, so I expect it will be a few hours
before CPU usage levels out.
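For reference, the flags were set per the thread, and the loop below is
a rough sketch of how I'm watching catch-up progress. It assumes the
default cluster name (OSD data dirs under /var/lib/ceph/osd/ceph-*),
that jq is installed, and that it's run on an OSD host with access to
the daemons' admin sockets:

    ceph osd set noup
    ceph osd set nodown
    ceph osd set noout

    # current cluster osdmap epoch ('ceph osd dump' starts with "epoch N")
    cluster_epoch=$(ceph osd dump | awk '/^epoch/ {print $2; exit}')

    # for each OSD on this host, ask the daemon via its admin socket which
    # osdmap epoch it has caught up to, and report how far behind it is
    for dir in /var/lib/ceph/osd/ceph-*; do
        id=${dir##*-}
        osd_epoch=$(ceph daemon osd.$id status | jq .newest_map)
        echo "osd.$id: $((cluster_epoch - osd_epoch)) epochs behind"
    done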
Once the OSDs are caught up, are there any other steps I should take
before running 'ceph osd unset noup' (or anything I should do after)?
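For completeness, this is roughly what I'm planning once everything has
caught up (the ordering is my own assumption, not something from the
thread):

    # let the caught-up OSDs mark themselves up again
    ceph osd unset noup
    # once 'ceph osd tree' shows everything up and PGs are peering/recovering,
    # drop the remaining flags
    ceph osd unset nodown
    ceph osd unset noout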
Thanks in advance,
--
v/r
Chris Apsey
bitskrieg@xxxxxxxxxxxxx
https://www.bitskrieg.net