Is there a recommended way to take everything down and restart the process? I was considering starting completely from scratch, i.e. an OS reinstall and then using ceph-deploy as before (roughly the sequence sketched below). I've learned a lot and want to figure out a foolproof way I can document for others in our lab to bring up a cluster on new HW. I learn a lot more when I break things and have to figure out what went wrong, so while it's a little frustrating, I've found out a lot about verifying the configuration and the debug options so far. My intent is to investigate rbd usage, performance, and configuration options.

The "endless loop" I'm referring to is a constant stream of fault messages that I don't yet know how to interpret. I have let them run to see if the cluster recovers, but ceph-mon always crashes. I'll look for the crash dump and save it, since kdump should be enabled on the monitor box.

Thanks for the feedback.
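For reference, the teardown I have in mind (short of the full OS reinstall) is roughly the ceph-deploy "start over" sequence; the OSD hostnames and the device name below are just placeholders for my three OSD boxes and their recycled drives:

    ceph-deploy purge essperf3 osd1 osd2 osd3        # uninstall the ceph packages
    ceph-deploy purgedata essperf3 osd1 osd2 osd3    # wipe /var/lib/ceph and /etc/ceph
    ceph-deploy forgetkeys                           # discard the locally cached keyrings
    rm -f ceph.conf *.keyring *.log                  # clear the ceph-deploy working directory
    ceph-deploy disk zap osd1:sdb osd2:sdb osd3:sdb  # relabel the recycled data disks before reuse

After that the bring-up would be the same ceph-deploy new / install / mon create-initial / osd create steps I used before.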
> On Aug 3, 2014, at 8:30 AM, "Sage Weil" <sweil at redhat.com> wrote:
>
> Hi Bruce,
>
>> On Sun, 3 Aug 2014, Bruce McFarland wrote:
>> Yes, I looked at tcpdump on each of the OSDs and saw communications between
>> all 3 OSDs before I sent my first question to this list. When I disabled
>> selinux on the one offending server based on your feedback (typically we
>> have this disabled on lab systems that are only on the lab net), the 10 PGs
>> in my test pool all went to 'active+clean' almost immediately. Unfortunately
>> the 3 default pools still remain in the creating states and are not
>> health_ok. The OSDs all stayed UP/IN after the selinux change for the rest
>> of the day until I made the mistake of creating an RBD image on demo-pool
>> and its 10 'active+clean' PGs. I created the rbd, but when I attempted to
>> look at it with 'rbd info' the cluster went into an endless loop trying to
>> read a placement group, which I left running overnight. This morning
>
> What do you mean by "went into an endless loop"?
>
>> ceph-mon had crashed again. I'll probably start all over from scratch once
>> again on Monday.
>
> Was there a stack dump in the mon log?
>
> It is possible that there is a bug with pool creation that surfaced
> by having selinux in place for so long, but otherwise this scenario
> doesn't make much sense to me. :/ Very interested in hearing more,
> and/or whether you can reproduce it.
>
> Thanks!
> sage
>
>> I deleted ceph-mds and got rid of the "laggy" comments from 'ceph health'.
>> The "official" online Ceph docs on that are "coming soon", and most
>> references I could find were pre-Firefly, so it was a little trial and
>> error to figure out to use the pool number and not its name to get the
>> removal to work. Same with 'ceph mds newfs' to get rid of the laggy-ness
>> in the 'ceph health' output.
>>
>> [root at essperf3 Ceph]# ceph mds rm 0 mds.essperf3
>> mds gid 0 dne
>> [root at essperf3 Ceph]# ceph health
>> HEALTH_WARN 96 pgs incomplete; 96 pgs peering; 192 pgs stuck inactive; 192 pgs stuck unclean; mds essperf3 is laggy
>> [root at essperf3 Ceph]# ceph mds newfs 1 0 --yes-i-really-mean-it
>> new fs with metadata pool 1 and data pool 0
>> [root at essperf3 Ceph]# ceph health
>> HEALTH_WARN 96 pgs incomplete; 96 pgs peering; 192 pgs stuck inactive; 192 pgs stuck unclean
>> [root at essperf3 Ceph]#
>>
>> From: Brian Rak [mailto:brak at gameservers.com]
>> Sent: Friday, August 01, 2014 6:14 PM
>> To: Bruce McFarland; ceph-users at lists.ceph.com
>> Subject: Re: [ceph-users] Firefly OSDs stuck in creating state forever
>>
>> What happens if you remove nodown? I'd be interested to see which OSDs it
>> thinks are down. My next thought would be tcpdump on the private interface:
>> see if the OSDs are actually managing to connect to each other.
>>
>> For comparison, when I bring up a cluster of 3 OSDs it goes to HEALTH_OK
>> nearly instantly (definitely under a minute!), so it's probably not just
>> taking a while.
>>
>> Does 'ceph osd dump' show the proper public and private IPs?
>>
>> On 8/1/2014 6:13 PM, Bruce McFarland wrote:
>>
>> MDS: I assumed that I'd need to bring up a ceph-mds for my cluster at
>> initial bringup. We also intended to modify the CRUSH map such that its
>> pool is resident on SSD(s). It is one of the areas of the online docs
>> where there doesn't seem to be a lot of info, and I haven't spent a lot
>> of time researching it. I'll stop it.
>>
>> OSD connectivity: The connectivity is good for both 1GE and 10GE. I
>> thought moving to 10GE with nothing else on that net might help with
>> placement groups etc. and bring the PGs up quicker. I've checked tcpdump
>> output on all boxes.
>>
>> Firewall: Thanks for that one - it's the "basic" I overlooked in my ceph
>> learning curve. One of the OSDs had selinux=enforcing; all the others
>> were disabled. After changing that box, the 10 PGs in my demo-pool (I
>> kept the PG count very small for sanity) are now 'active+clean'. The PGs
>> for the default pools - data, metadata, rbd - are still stuck in
>> creating+peering or creating+incomplete. I did have to manually set
>> 'osd pool default min size = 1' from its default of 2 for these 3 pools
>> to eliminate a bunch of warnings in the 'ceph health detail' output.
>>
>> I'm adding the [mon] setting you suggested below, stopping ceph-mds, and
>> bringing everything up now.
>>
>> [root at essperf3 Ceph]# ceph -s
>>     cluster 4b3ffe60-73f4-4512-b7da-b04e4775dd73
>>      health HEALTH_WARN 96 pgs incomplete; 96 pgs peering; 192 pgs stuck inactive; 192 pgs stuck unclean; 28 requests are blocked > 32 sec; nodown,noscrub flag(s) set
>>      monmap e1: 1 mons at {essperf3=209.243.160.35:6789/0}, election epoch 1, quorum 0 essperf3
>>      mdsmap e43: 1/1/1 up {0=essperf3=up:creating}
>>      osdmap e752: 3 osds: 3 up, 3 in
>>             flags nodown,noscrub
>>       pgmap v1483: 202 pgs, 4 pools, 0 bytes data, 0 objects
>>             134 MB used, 1158 GB / 1158 GB avail
>>                   96 creating+peering
>>                   10 active+clean   <<<<<<<<<<<<<<<<!!!!!!!!
>>                   96 creating+incomplete
>> [root at essperf3 Ceph]#
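>>
>> For reference, the additions to ceph.conf ended up looking roughly like
>> this (section placement is my best guess and the comments are mine):
>>
>> [global]
>> osd pool default min size = 1     # the default of 2 was generating 'ceph health detail' warnings
>>
>> [mon]
>> mon osd min down reporters = 2    # your suggestion below, to damp the single-OSD down reports
>>
>> If the default doesn't take effect for the pools that already exist, I
>> believe the per-pool equivalent is 'ceph osd pool set <pool> min_size 1'.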
>>
>> From: Brian Rak [mailto:brak at gameservers.com]
>> Sent: Friday, August 01, 2014 2:54 PM
>> To: Bruce McFarland; ceph-users at lists.ceph.com
>> Subject: Re: [ceph-users] Firefly OSDs stuck in creating state forever
>>
>> Why do you have an MDS active? I'd suggest getting rid of that at least
>> until you have everything else working.
>>
>> I see you've set nodown on the OSDs; did you have problems with the OSDs
>> flapping? Do the OSDs have broken connectivity between themselves? Do you
>> have some kind of firewall interfering here?
>>
>> I've seen odd issues when the OSDs have broken private networking: you'll
>> get one OSD marking all the others down. Adding this to my config helped:
>>
>> [mon]
>> mon osd min down reporters = 2
>>
>> On 8/1/2014 5:41 PM, Bruce McFarland wrote:
>>
>> Hello,
>> I've run out of ideas and assume I've overlooked something very basic.
>> I've created 2 ceph clusters in the last 2 weeks with different OSD HW
>> and private network fabrics - 1GE and 10GE. I have never been able to get
>> the placement groups to reach the 'active+clean' state. I have followed
>> your online documentation, and at this point the only thing I don't think
>> I've done is modify the CRUSH map (although I have been looking into
>> that). These are new clusters with no data and only 1 HDD and 1 SSD per
>> OSD (24 2.5GHz cores with 64GB RAM).
>>
>> Since the disks are being recycled, is there something I need to flag to
>> let ceph just create its mappings, but not scrub for data compatibility?
>> I've tried setting the noscrub flag to no effect.
>>
>> I also have constant OSD flapping. I've set nodown, but assume that is
>> just masking a problem that is still occurring.
>>
>> Besides never reaching the 'active+clean' state, ceph-mon always crashes
>> after being left running overnight. The OSDs all eventually fill /root
>> with ceph logs, so I regularly have to bring everything down, delete the
>> logs, and restart.
>>
>> I have all sorts of output: the ceph.conf; OSD boot output with
>> 'debug osd = 20' and 'debug ms = 1'; 'ceph -w' output; and pretty much
>> all of the debug/monitoring suggestions from the online docs and 2 weeks
>> of Google searches through blogs, mailing lists, etc.
>>
>> [root at essperf3 Ceph]# ceph -v
>> ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
>> [root at essperf3 Ceph]# ceph -s
>>     cluster 4b3ffe60-73f4-4512-b7da-b04e4775dd73
>>      health HEALTH_WARN 96 pgs incomplete; 106 pgs peering; 202 pgs stuck inactive; 202 pgs stuck unclean; nodown,noscrub flag(s) set
>>      monmap e1: 1 mons at {essperf3=209.243.160.35:6789/0}, election epoch 1, quorum 0 essperf3
>>      mdsmap e43: 1/1/1 up {0=essperf3=up:creating}
>>      osdmap e752: 3 osds: 3 up, 3 in
>>             flags nodown,noscrub
>>       pgmap v1476: 202 pgs, 4 pools, 0 bytes data, 0 objects
>>             134 MB used, 1158 GB / 1158 GB avail
>>                  106 creating+peering
>>                   96 creating+incomplete
>> [root at essperf3 Ceph]#
>>
>> Suggestions?
>> Thanks,
>> Bruce
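>>
>> P.S. For completeness, the flags and debug settings mentioned above are
>> just these (the flags set from the CLI, the debug lines in ceph.conf,
>> e.g. under [osd]):
>>
>> ceph osd set noscrub     # try to skip scrubbing of the recycled disks
>> ceph osd set nodown      # mask the flapping while debugging
>> ceph osd unset nodown    # to back a flag out again
>>
>> # in ceph.conf:
>> debug osd = 20
>> debug ms = 1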