Ignore the kdump comment; the kernel didn't crash, only ceph-mon. I'll save that portion of the ceph-mon log.

Sent from my iPhone

> On Aug 3, 2014, at 9:58 AM, "Bruce McFarland" <Bruce.McFarland at taec.toshiba.com> wrote:
>
> Is there a recommended way to take everything down and restart the process? I was considering starting completely from scratch, i.e. OS reinstall, and then using ceph-deploy as before.
> I've learned a lot and want to figure out a foolproof way I can document for others in our lab to bring up a cluster on new HW. I learn a lot more when I break things and have to figure out what went wrong, so it's a little frustrating, but I've found out a lot about verifying the configuration and debug options so far. My intent is to investigate rbd usage, perf, and configuration options.
>
> The "endless loop" I'm referring to is a constant stream of fault messages that I'm not yet familiar with how to interpret. I have let them run to see if the cluster recovers, but ceph-mon always crashed. I'll look for the crash dump and save it, since kdump should be enabled on the monitor box.
>
> Thanks for the feedback.
>
>
>> On Aug 3, 2014, at 8:30 AM, "Sage Weil" <sweil at redhat.com> wrote:
>>
>> Hi Bruce,
>>
>>> On Sun, 3 Aug 2014, Bruce McFarland wrote:
>>> Yes, I looked at tcpdump on each of the OSDs and saw communications between
>>> all 3 OSDs before I sent my first question to this list. When I disabled
>>> selinux on the one offending server based on your feedback (typically we
>>> have this disabled on lab systems that are only on the lab net), the 10 PGs
>>> in my test pool all went to "active+clean" almost immediately. Unfortunately
>>> the 3 default pools still remain in the creating states and are not
>>> health_ok. The OSDs all stayed UP/IN after the selinux change for the rest
>>> of the day until I made the mistake of creating an RBD image on demo-pool and
>>> its 10 "active+clean" PGs. I created the rbd, but when I attempted to
>>> look at it with "rbd info", the cluster went into an endless loop trying to
>>> read a placement group, a loop that I left running overnight. This morning
>>
>> What do you mean by "went into an endless loop"?
>>
>>> ceph-mon was crashed again. I'll probably start all over from scratch once
>>> again on Monday.
>>
>> Was there a stack dump in the mon log?
>>
>> It is possible that there is a bug with pool creation that surfaced
>> by having selinux in place for so long, but otherwise this scenario
>> doesn't make much sense to me. :/ Very interested in hearing more,
>> and/or whether you can reproduce it.
>>
>> Thanks!
>> sage
>>
>>
>>> I deleted ceph-mds and got rid of the "laggy" comments from "ceph health".
>>> The "official" online Ceph docs on that are "coming soon", and most references I
>>> could find were pre-firefly, so it was a little trial and error to figure out
>>> to use the pool number and not its name to get the removal to work. Same
>>> with "ceph mds newfs" to get rid of the laggy-ness in the "ceph health"
>>> output.
>>>
>>> [root at essperf3 Ceph]# ceph mds rm 0 mds.essperf3
>>> mds gid 0 dne
>>> [root at essperf3 Ceph]# ceph health
>>> HEALTH_WARN 96 pgs incomplete; 96 pgs peering; 192 pgs stuck inactive; 192 pgs stuck unclean; mds essperf3 is laggy
>>> [root at essperf3 Ceph]# ceph mds newfs 1 0 --yes-i-really-mean-it
>>> new fs with metadata pool 1 and data pool 0
>>> [root at essperf3 Ceph]# ceph health
>>> HEALTH_WARN 96 pgs incomplete; 96 pgs peering; 192 pgs stuck inactive; 192 pgs stuck unclean
>>> [root at essperf3 Ceph]#
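A quick sketch of how the pool IDs used in "ceph mds newfs" above can be double-checked; the name-to-ID mapping noted in the comment assumes the stock firefly default pools:

    # "ceph mds newfs" takes pool IDs, not pool names; list them first
    ceph osd lspools
    # with the default pools this normally reports: 0 data, 1 metadata, 2 rbd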
>>>
>>> From: Brian Rak [mailto:brak at gameservers.com]
>>> Sent: Friday, August 01, 2014 6:14 PM
>>> To: Bruce McFarland; ceph-users at lists.ceph.com
>>> Subject: Re: [ceph-users] Firefly OSDs stuck in creating state forever
>>>
>>> What happens if you remove nodown? I'd be interested to see what OSDs it
>>> thinks are down. My next thought would be tcpdump on the private interface.
>>> See if the OSDs are actually managing to connect to each other.
>>>
>>> For comparison, when I bring up a cluster of 3 OSDs it goes to HEALTH_OK
>>> nearly instantly (definitely under a minute!), so it's probably not just
>>> taking a while.
>>>
>>> Does 'ceph osd dump' show the proper public and private IPs?
>>>
>>> On 8/1/2014 6:13 PM, Bruce McFarland wrote:
>>>
>>> MDS: I assumed that I'd need to bring up a ceph-mds for my
>>> cluster at initial bringup. We also intended to modify the CRUSH
>>> map such that its pool is resident to SSD(s). It is one of the
>>> areas of the online docs where there doesn't seem to be a lot of info,
>>> and I haven't spent a lot of time researching it. I'll stop it.
>>>
>>> OSD connectivity: The connectivity is good for both 1GE and
>>> 10GE. I thought moving to 10GE with nothing else on that net
>>> might help with group placement etc. and bring up the PGs
>>> quicker. I've checked "tcpdump" output on all boxes.
>>>
>>> Firewall: Thanks for that one - it's the "basic" I overlooked
>>> in my Ceph learning curve. One of the OSDs had selinux=enforcing;
>>> all others were disabled. After changing that box, the 10 PGs
>>> in my demo-pool (I kept the PG count very small for sanity) are now
>>> "active+clean". The PGs for the default pools (data, metadata, rbd)
>>> are still stuck in creating+peering or creating+incomplete. I did
>>> have to manually set "osd pool default min size = 1" from its
>>> default of 2 for these 3 pools to eliminate a bunch of warnings
>>> in the "ceph health detail" output.
>>>
>>> I'm adding the [mon] setting you suggested below, stopping
>>> ceph-mds, and bringing everything up now.
>>>
>>> [root at essperf3 Ceph]# ceph -s
>>>     cluster 4b3ffe60-73f4-4512-b7da-b04e4775dd73
>>>      health HEALTH_WARN 96 pgs incomplete; 96 pgs peering; 192 pgs stuck inactive; 192 pgs stuck unclean; 28 requests are blocked > 32 sec; nodown,noscrub flag(s) set
>>>      monmap e1: 1 mons at {essperf3=209.243.160.35:6789/0}, election epoch 1, quorum 0 essperf3
>>>      mdsmap e43: 1/1/1 up {0=essperf3=up:creating}
>>>      osdmap e752: 3 osds: 3 up, 3 in
>>>             flags nodown,noscrub
>>>       pgmap v1483: 202 pgs, 4 pools, 0 bytes data, 0 objects
>>>             134 MB used, 1158 GB / 1158 GB avail
>>>                   96 creating+peering
>>>                   10 active+clean   <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<!!!!!!!!
>>>                   96 creating+incomplete
>>> [root at essperf3 Ceph]#
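A short sketch of the connectivity checks Brian suggests above; the interface name is a placeholder for the private/cluster NIC, and 6800-7300 is the stock OSD port range:

    # confirm each OSD registered both a public and a cluster (private) address
    ceph osd dump | grep "^osd\."
    # watch the private interface for OSD-to-OSD traffic
    tcpdump -ni eth1 portrange 6800-7300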
>>>
>>> From: Brian Rak [mailto:brak at gameservers.com]
>>> Sent: Friday, August 01, 2014 2:54 PM
>>> To: Bruce McFarland; ceph-users at lists.ceph.com
>>> Subject: Re: [ceph-users] Firefly OSDs stuck in creating state forever
>>>
>>> Why do you have an MDS active? I'd suggest getting rid of that at
>>> least until you have everything else working.
>>>
>>> I see you've set nodown on the OSDs; did you have problems with the
>>> OSDs flapping? Do the OSDs have broken connectivity between
>>> themselves? Do you have some kind of firewall interfering here?
>>>
>>> I've seen odd issues when the OSDs have broken private networking:
>>> you'll get one OSD marking all the other ones down. Adding this to my
>>> config helped:
>>>
>>> [mon]
>>> mon osd min down reporters = 2
>>>
>>> On 8/1/2014 5:41 PM, Bruce McFarland wrote:
>>>
>>> Hello,
>>> I've run out of ideas and assume I've overlooked something
>>> very basic. I've created 2 Ceph clusters in the last 2
>>> weeks with different OSD HW and private network fabrics,
>>> 1GE and 10GE. I have never been able to get the OSDs to
>>> come up to the "active+clean" state. I have followed your
>>> online documentation, and at this point the only thing I
>>> don't think I've done is modify the CRUSH map (although
>>> I have been looking into that). These are new clusters
>>> with no data and only 1 HDD and 1 SSD per OSD (24 2.5GHz
>>> cores with 64GB RAM).
>>>
>>> Since the disks are being recycled, is there something I
>>> need to flag to let Ceph just create its mappings, but
>>> not scrub for data compatibility? I've tried setting the
>>> noscrub flag to no effect.
>>>
>>> I also have constant OSD flapping. I've set nodown, but
>>> assume that is just masking a problem that is still
>>> occurring.
>>>
>>> Besides never reaching the "active+clean" state,
>>> ceph-mon always crashes after leaving it running
>>> overnight. The OSDs all eventually fill /root with
>>> ceph logs, so I regularly have to bring everything down,
>>> delete the logs, and restart.
>>>
>>> I have all sorts of output: the ceph.conf; OSD boot
>>> output with "debug osd = 20" and "debug ms = 1"; "ceph -w"
>>> output; and pretty much all of the debug/monitoring
>>> suggestions from the online docs and 2 weeks of google
>>> searches from online references in blogs, mailing lists,
>>> etc.
>>>
>>> [root at essperf3 Ceph]# ceph -v
>>> ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
>>> [root at essperf3 Ceph]# ceph -s
>>>     cluster 4b3ffe60-73f4-4512-b7da-b04e4775dd73
>>>      health HEALTH_WARN 96 pgs incomplete; 106 pgs peering; 202 pgs stuck inactive; 202 pgs stuck unclean; nodown,noscrub flag(s) set
>>>      monmap e1: 1 mons at {essperf3=209.243.160.35:6789/0}, election epoch 1, quorum 0 essperf3
>>>      mdsmap e43: 1/1/1 up {0=essperf3=up:creating}
>>>      osdmap e752: 3 osds: 3 up, 3 in
>>>             flags nodown,noscrub
>>>       pgmap v1476: 202 pgs, 4 pools, 0 bytes data, 0 objects
>>>             134 MB used, 1158 GB / 1158 GB avail
>>>                  106 creating+peering
>>>                   96 creating+incomplete
>>> [root at essperf3 Ceph]#
>>>
>>> Suggestions?
>>> Thanks,
>>> Bruce
>>>
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users at lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
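For reference, a sketch of toggling the debug levels mentioned above at runtime instead of restarting the OSDs with the flags in ceph.conf; this assumes admin access from the monitor host:

    # raise OSD and messenger logging while reproducing the peering problem
    ceph tell osd.* injectargs '--debug-osd 20 --debug-ms 1'
    # drop it back down afterwards so the logs don't fill the root filesystem again
    ceph tell osd.* injectargs '--debug-osd 0 --debug-ms 0'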