Firefly OSDs stuck in creating state forever

Hi Bruce,

On Sun, 3 Aug 2014, Bruce McFarland wrote:
> Yes, I looked at tcpdump on each of the OSDs and saw communications between
> all 3 OSDs before I sent my first question to this list. When I disabled
> selinux on the one offending server based on your feedback (typically we
> have this disabled on lab systems that are only on the lab net) the 10 PGs
> in my test pool all went to 'active+clean' almost immediately. Unfortunately
> the 3 default pools still remain in the creating states and are not
> HEALTH_OK. The OSDs all stayed UP/IN after the selinux change for the rest
> of the day until I made the mistake of creating an RBD image on demo-pool and
> its 10 'active+clean' PGs. I created the rbd, but when I attempted to
> look at it with 'rbd info' the cluster went into an endless loop trying to
> read a placement group, a loop that I left running overnight. This morning

What do you mean by "went into an endless loop"?

> ceph-mon had crashed again. I'll probably start all over from scratch once
> again on Monday.

Was there a stack dump in the mon log?
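If so, something along these lines should pull out the interesting bit 
(the log path may differ depending on how you deployed):

  grep -B2 -A20 -e 'Caught signal' -e 'FAILED assert' /var/log/ceph/ceph-mon.*.log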

It is possible that there is a bug in pool creation that surfaced because 
selinux was in place for so long, but otherwise this scenario doesn't make 
much sense to me.  :/  Very interested in hearing more, and/or in whether 
you can reproduce it.

Thanks!
sage


> 
> I deleted ceph-mds and got rid of the 'laggy' comments from 'ceph health'.
> The 'official' online Ceph docs on that are still 'coming soon' and most
> references I could find were pre-firefly, so it was a little trial and error
> to figure out that you use the pool number and not its name to get the
> removal to work. Same with 'ceph mds newfs' to get rid of the 'laggy-ness'
> in the 'ceph health' output.
> 
> [root@essperf3 Ceph]# ceph mds rm 0 mds.essperf3
> mds gid 0 dne
> [root@essperf3 Ceph]# ceph health
> HEALTH_WARN 96 pgs incomplete; 96 pgs peering; 192 pgs stuck inactive; 192 pgs stuck unclean; mds essperf3 is laggy
> [root@essperf3 Ceph]# ceph mds newfs 1 0 --yes-i-really-mean-it
> new fs with metadata pool 1 and data pool 0
> [root@essperf3 Ceph]# ceph health
> HEALTH_WARN 96 pgs incomplete; 96 pgs peering; 192 pgs stuck inactive; 192 pgs stuck unclean
> [root@essperf3 Ceph]#
> 
> From: Brian Rak [mailto:brak at gameservers.com]
> Sent: Friday, August 01, 2014 6:14 PM
> To: Bruce McFarland; ceph-users at lists.ceph.com
> Subject: Re: [ceph-users] Firefly OSDs stuck in creating state forever
> 
> What happens if you remove nodown? I'd be interested to see which OSDs it
> thinks are down. My next thought would be tcpdump on the private interface.
> See if the OSDs are actually managing to connect to each other.
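> 
> Something like this on the cluster-network interface should show whether the
> OSDs are actually talking to each other (eth1 is just a guess for your setup;
> IIRC the OSDs listen somewhere in the 6800-7300 range by default):
> 
>   tcpdump -ni eth1 'tcp portrange 6800-7300'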
> 
> For comparison, when I bring up a cluster of 3 OSDs it goes to HEALTH_OK
> nearly instantly (definitely under a minute!), so it's probably not just
> taking a while.
> 
> Does 'ceph osd dump' show the proper public and private IPs?
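> 
> Each osd line in that output should show the public address followed by the
> cluster address (if I remember the ordering right), so a quick check is
> something like:
> 
>   ceph osd dump | grep '^osd\.'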
> 
> On 8/1/2014 6:13 PM, Bruce McFarland wrote:
> 
>       MDS: I assumed that I'd need to bring up a ceph-mds for my
>       cluster at initial bringup. We also intended to modify the CRUSH
>       map such that its pool is resident on SSD(s). It is one of the
>       areas of the online docs where there doesn't seem to be a lot of
>       info, and I haven't spent a lot of time researching it. I'll stop it.
> 
>       OSD connectivity: The connectivity is good for both 1GE and
>       10GE. I thought moving to 10GE with nothing else on that net
>       might help with placement group peering etc. and bring the PGs
>       up quicker. I've checked tcpdump output on all boxes.
> 
>       Firewall: Thanks for that one - it's the 'basic' I overlooked
>       in my ceph learning curve. One of the OSDs had selinux=enforcing;
>       all the others were disabled. After changing that box, the 10 PGs
>       in my demo-pool (I kept the PG count very small for sanity) are
>       now 'active+clean'. The PGs for the default pools (data, metadata,
>       rbd) are still stuck in 'creating+peering' or 'creating+incomplete'.
>       I did have to manually set 'osd pool default min size = 1' from
>       its default of 2 for these 3 pools to eliminate a bunch of
>       warnings in the 'ceph health detail' output.
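>       (For pools that already exist, I believe the runtime equivalent is
>       something like 'ceph osd pool set data min_size 1' per pool, though
>       I haven't double-checked that on firefly.)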
> 
>       I'm adding the [mon] setting you suggested below, stopping
>       ceph-mds, and bringing everything up now.
> 
>       [root@essperf3 Ceph]# ceph -s
>           cluster 4b3ffe60-73f4-4512-b7da-b04e4775dd73
>            health HEALTH_WARN 96 pgs incomplete; 96 pgs peering; 192 pgs stuck inactive; 192 pgs stuck unclean; 28 requests are blocked > 32 sec; nodown,noscrub flag(s) set
>            monmap e1: 1 mons at {essperf3=209.243.160.35:6789/0}, election epoch 1, quorum 0 essperf3
>            mdsmap e43: 1/1/1 up {0=essperf3=up:creating}
>            osdmap e752: 3 osds: 3 up, 3 in
>                   flags nodown,noscrub
>             pgmap v1483: 202 pgs, 4 pools, 0 bytes data, 0 objects
>                   134 MB used, 1158 GB / 1158 GB avail
>                         96 creating+peering
>                         10 active+clean   <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<!!!!!!!!
>                         96 creating+incomplete
>       [root@essperf3 Ceph]#
> 
>       From: Brian Rak [mailto:brak at gameservers.com]
>       Sent: Friday, August 01, 2014 2:54 PM
>       To: Bruce McFarland; ceph-users at lists.ceph.com
>       Subject: Re: [ceph-users] Firefly OSDs stuck in creating state
>       forever
> 
> Why do you have an MDS active? I'd suggest getting rid of that at
> least until you have everything else working.
> 
> I see you've set nodown on the OSDs; did you have problems with the
> OSDs flapping? Do the OSDs have broken connectivity between
> themselves? Do you have some kind of firewall interfering here?
> 
> I've seen odd issues when the OSDs have broken private networking;
> you'll get one OSD marking all the other ones down. Adding this to my
> config helped:
> 
> [mon]
> mon osd min down reporters = 2
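> 
> (If you'd rather not restart the mon, I think you can also inject it at
> runtime with something along the lines of
> 
>     ceph tell mon.essperf3 injectargs '--mon-osd-min-down-reporters 2'
> 
> but I haven't tried that on firefly.)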
> 
> 
> On 8/1/2014 5:41 PM, Bruce McFarland wrote:
> 
>       Hello,
> 
>       I've run out of ideas and assume I've overlooked something
>       very basic. I've created 2 ceph clusters in the last 2
>       weeks with different OSD HW and private network fabrics
>       (1GE and 10GE). I have never been able to get the PGs to
>       come up to the 'active+clean' state. I have followed your
>       online documentation, and at this point the only thing I
>       don't think I've done is modify the CRUSH map (although
>       I have been looking into that). These are new clusters
>       with no data and only 1 HDD and 1 SSD per OSD (24 2.5GHz
>       cores with 64GB RAM).
> 
>       Since the disks are being recycled, is there something I
>       need to flag to let ceph just create its mappings but
>       not scrub the old data on them? I've tried setting the
>       noscrub flag to no effect.
> 
>       I also have constant OSD flapping. I've set nodown, but
>       I assume that is just masking a problem that is still
>       occurring.
> 
>       Besides never reaching the 'active+clean' state,
>       ceph-mon always crashes after being left running
>       overnight. The OSDs all eventually fill /root with
>       ceph logs, so I regularly have to bring everything down,
>       delete the logs, and restart.
> 
>       I have all sorts of output: the ceph.conf; OSD boot
>       output with 'debug osd = 20' and 'debug ms = 1'; 'ceph -w'
>       output; and pretty much all of the debug/monitoring
>       suggestions from the online docs and 2 weeks of google
>       searches through blogs, mailing lists, etc.
> 
>       [root@essperf3 Ceph]# ceph -v
>       ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
>       [root@essperf3 Ceph]# ceph -s
>           cluster 4b3ffe60-73f4-4512-b7da-b04e4775dd73
>            health HEALTH_WARN 96 pgs incomplete; 106 pgs peering; 202 pgs stuck inactive; 202 pgs stuck unclean; nodown,noscrub flag(s) set
>            monmap e1: 1 mons at {essperf3=209.243.160.35:6789/0}, election epoch 1, quorum 0 essperf3
>            mdsmap e43: 1/1/1 up {0=essperf3=up:creating}
>            osdmap e752: 3 osds: 3 up, 3 in
>                   flags nodown,noscrub
>             pgmap v1476: 202 pgs, 4 pools, 0 bytes data, 0 objects
>                   134 MB used, 1158 GB / 1158 GB avail
>                        106 creating+peering
>                         96 creating+incomplete
>       [root@essperf3 Ceph]#
> 
>       Suggestions?
> 
>       Thanks,
> 
>       Bruce
> 

