Problems during first install

On Wed, 06 Aug 2014 09:18:13 +0200 Tijn Buijs wrote:

> Hello Pratik,
> 
> Thanks for this tip. It was the golden one :). I just deleted all my VMs
> again and started over, again with CentOS 6.5 and one 20 GB dynamically
> allocated OSD disk per data VM. And this time everything worked
> correctly, just like the documentation says :). I then went ahead and
> added a second OSD disk to each of the data nodes (also 20 GB,
> dynamically allocated) and added it to my Ceph cluster. And this also worked:
> [ceph@ceph-admin testcluster]$ ceph health
> HEALTH_OK
> [ceph@ceph-admin testcluster]$ ceph -s
>      cluster 4125efe2-caa1-4bf8-8c6d-f10b2c71bf27
>       health HEALTH_OK
>       monmap e1: 1 mons at {ceph-mon1=10.28.28.71:6789/0}, election 
> epoch 1, quorum 0 ceph-mon1
>       osdmap e54: 6 osds: 6 up, 6 in
>        pgmap v104: 192 pgs, 3 pools, 0 bytes data, 0 objects
>              210 MB used, 91883 MB / 92093 MB avail
>                   192 active+clean
> 
> This is what I want to see :). All that is left to do now is to increase
> the number of monitors from 1 to 3, and then I have a nice test environment
> that resembles our production environment closely enough :). I have already
> started on this and it doesn't work yet, but I will play around with
> it some more. If I can't get it to work I will start a new thread :).
> Also, I would like to understand why 10 GB per OSD isn't enough even to
> store nothing, but 20 GB per OSD is :).
> 
My guess would be that the journal (5 GB by default, so definitely not
"nothing" ^o^) and all the other bits created initially are too much for
comfort on a 10 GB disk.
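
If you want to stick with small test disks, one option (just a sketch, not
something I've verified against your exact setup) is to shrink the journal
in ceph.conf before creating the OSDs, e.g.:

    # in ceph.conf on the ceph-deploy admin node (test-only values)
    [osd]
    # default is 5120 MB (5 GB); 1 GB is plenty for an empty test cluster
    osd journal size = 1024

and push it to the nodes with something like "ceph-deploy --overwrite-conf
config push <osd-host>" (substitute your own hostnames) before creating the
OSDs with ceph-deploy. With 20 GB disks you obviously don't need to bother.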

Regards,

Christian
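
P.S.: As for growing from one monitor to three, this is roughly what it
tends to look like with ceph-deploy (untested here; the 10.28.28.0/24
network and the ceph-mon2/ceph-mon3 names are only guesses based on the
addresses in this thread):

    # "mon add" wants a public network defined in ceph.conf first
    [global]
    public network = 10.28.28.0/24

    # from the admin node, push the updated conf and add the monitors
    ceph-deploy --overwrite-conf config push ceph-mon1 ceph-mon2 ceph-mon3
    ceph-deploy mon add ceph-mon2
    ceph-deploy mon add ceph-mon3

    # then check that all three monitors form a quorum
    ceph quorum_status --format json-pretty

If that still doesn't behave, the new thread with the full ceph-deploy
output is probably the way to go.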

> Thnx everybody for your help!
> 
> Met vriendelijke groet/With kind regards,
> 
> Tijn Buijs
> 
> tijn@cloud.nl | T. 0800-CLOUDNL / +31 (0)162 820 000 | F. +31 (0)162 820 001
> Cloud.nl B.V. | Minervum 7092D | 4817 ZK Breda | www.cloud.nl
> On 05/08/14 12:13, Pratik Rupala wrote:
> > Hi Tijn,
> >
> > I created my first Ceph storage cluster in almost the same way you
> > did: 3 VMs for OSD nodes and 1 VM for the monitor node.
> > All 3 OSD VMs had one 10 GB virtual disk each, so I faced almost the
> > same problem you are facing right now.
> > Changing the disk size from 10 GB to 20 GB solved my problem.
> >
> > I don't know whether dynamic disks will cause any problem. But instead
> > of having 6 OSDs you could use 3 OSDs, one OSD per VM, and increase the
> > disk size from 10 GB to 20 GB for those 3 OSDs.
> >
> > I don't know whether this will solve your problem, but it is worth a
> > try. 3 OSDs are enough for initial testing.
> >
> > Regards,
> > Pratik
> >
> >
> > On 8/5/2014 12:37 PM, Tijn Buijs wrote:
> >> Hello Pratik,
> >>
> >> I'm using virtual disks as OSDs. I prefer virtual disks over 
> >> directories because this resembles the production environment a bit 
> >> better.
> >> I'm using VirtualBox for virtualisation. The OSDs are dynamic disks, 
> >> not pre-allocated, but this shouldn't be a problem, right? I don't 
> >> have the disk space on my iMac to have all 6 OSDs pre-allocated :). 
> >> I've made the virtual OSD disks 10 GB each, by the way, so that 
> >> should be enough for a first test, imho.
> >>
> >> Met vriendelijke groet/With kind regards,
> >>
> >> Tijn Buijs
> >>
> >> tijn@cloud.nl | T. 0800-CLOUDNL / +31 (0)162 820 000 | F. +31 (0)162 820 001
> >> Cloud.nl B.V. | Minervum 7092D | 4817 ZK Breda | www.cloud.nl
> >> On 04/08/14 14:51, Pratik Rupala wrote:
> >>> Hi,
> >>>
> >>> You mentioned that you have 3 hosts which are VMs. Are you using 
> >>> simple directories as OSDs or virtual disks as OSDs?
> >>>
> >>> I had the same problem a few days back, where the OSDs were not
> >>> making enough space available to the cluster.
> >>>
> >>> Try increasing the size of the disks if you are using virtual disks.
> >>> If you are using directories as OSDs, check whether you have enough
> >>> space on the root device with the df -h command on each OSD node.
> >>>
> >>> Regards,
> >>> Pratik
> >>>
> >>> On 8/4/2014 4:11 PM, Tijn Buijs wrote:
> >>>> Hi Everybody,
> >>>>
> >>>> My idea was that maybe I was just impatient, so I left my Ceph
> >>>> cluster running over the weekend. So from Friday 15:00 until now (it
> >>>> is Monday morning 11:30 here) it kept on running. And it didn't
> >>>> help :). It still needs to create 192 PGs.
> >>>> I've reinstalled my entire cluster a few times now. I switched over
> >>>> from CentOS 6.5 to Ubuntu 14.04.1 LTS and back to CentOS again, and
> >>>> every time I get exactly the same results. The PGs end up in the
> >>>> incomplete, stuck inactive, stuck unclean state. What am I doing
> >>>> wrong? :)
> >>>>
> >>>> For the moment I'm running with 6 OSDs evenly divided over 3 hosts
> >>>> (so each host has 2 OSDs). I've only got 1 monitor configured in my
> >>>> current cluster. I hit some other problem when trying to add
> >>>> monitors 2 and 3 again, and to avoid complicating things with
> >>>> multiple problems at the same time I've switched back to only 1
> >>>> monitor. The cluster should work that way, right?
> >>>>
> >>>> To make things clear for everybody, here is the output of ceph 
> >>>> health and ceph -s:
> >>>> $ ceph health
> >>>> HEALTH_WARN 192 pgs incomplete; 192 pgs stuck inactive; 192 pgs 
> >>>> stuck unclean
> >>>> $ ceph -s
> >>>>     cluster 43d5f48b-d034-4f50-bec8-5c4f3ad8276f
> >>>>      health HEALTH_WARN 192 pgs incomplete; 192 pgs stuck inactive; 
> >>>> 192 pgs stuck unclean
> >>>>      monmap e1: 1 mons at {ceph-mon1=10.28.28.71:6789/0}, election 
> >>>> epoch 1, quorum 0 ceph-mon1
> >>>>      osdmap e20: 6 osds: 6 up, 6 in
> >>>>       pgmap v40: 192 pgs, 3 pools, 0 bytes data, 0 objects
> >>>>             197 MB used, 30456 MB / 30653 MB avail
> >>>>                  192 incomplete
> >>>>
> >>>> I hope somebody has an idea for me to try :).
> >>>>
> >>>> Met vriendelijke groet/With kind regards,
> >>>>
> >>>> Tijn Buijs
> >>>>
> >>>> tijn@cloud.nl | T. 0800-CLOUDNL / +31 (0)162 820 000 | F. +31 (0)162 820 001
> >>>> Cloud.nl B.V. | Minervum 7092D | 4817 ZK Breda | www.cloud.nl
> >>>> On 31/07/14 17:19, Alfredo Deza wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Thu, Jul 31, 2014 at 10:36 AM, Tijn Buijs <tijn@cloud.nl> wrote:
> >>>>>
> >>>>>     Hello everybody,
> >>>>>
> >>>>>     At cloud.nl we are going to use Ceph, so I figured it would be
> >>>>>     a good idea to get some hands-on experience with it so I can
> >>>>>     work with it :). So I'm installing a test cluster in a few
> >>>>>     VirtualBox machines on my iMac, which runs OS X 10.9.4 of
> >>>>>     course. I know I will get lousy performance, but that's not
> >>>>>     the objective here. The objective is to get some experience
> >>>>>     with Ceph, to see how it works.
> >>>>>
> >>>>>     But I hit an issue during the initial setup of the cluster.
> >>>>>     After installing everything and following the how-tos on
> >>>>>     ceph.com (the preflight
> >>>>>     <http://ceph.com/docs/master/start/quick-start-preflight/> and
> >>>>>     the Storage Cluster quick start
> >>>>>     <http://ceph.com/docs/master/start/quick-ceph-deploy/>), I run
> >>>>>     ceph health to see whether everything is running perfectly.
> >>>>>     But it isn't; I get the following output:
> >>>>>     ceph@ceph-admin:~$ ceph health
> >>>>>     HEALTH_WARN 192 pgs incomplete; 192 pgs stuck inactive; 192
> >>>>>     pgs stuck unclean
> >>>>>
> >>>>>     And it stays like this; it never changes. So everything is
> >>>>>     really stuck. But I don't know what exactly is stuck or how I
> >>>>>     can fix it. Some more info about my cluster:
> >>>>>     ceph@ceph-admin:~$ ceph -s
> >>>>>         cluster d31586a5-6dd6-454e-8835-0d6d9e204612
> >>>>>          health HEALTH_WARN 192 pgs incomplete; 192 pgs stuck
> >>>>>     inactive; 192 pgs stuck unclean
> >>>>>          monmap e3: 3 mons at
> >>>>>     {ceph-mon1=10.28.28.18:6789/0,ceph-mon2=10.28.28.31:6789/0,ceph-mon3=10.28.28.50:6789/0},
> >>>>>     election epoch 4, quorum 0,1,2 ceph-mon1,ceph-mon2,ceph-mon3
> >>>>>          osdmap e25: 6 osds: 6 up, 6 in
> >>>>>           pgmap v56: 192 pgs, 3 pools, 0 bytes data, 0 objects
> >>>>>                 197 MB used, 30455 MB / 30653 MB avail
> >>>>>                      192 creating+incomplete
> >>>>>
> >>>>>     I'm running on Ubuntu 14.04.1 LTS Server. I did try to get it
> >>>>>     running on CentOS 6.5 too (CentOS 6.5 is my actual distro of
> >>>>>     choice, but Ceph has more affinity with Ubuntu, so I tried
> >>>>>     that too), but I got exactly the same results.
> >>>>>
> >>>>>     But because this is my first install of Ceph I don't know the
> >>>>>     exact debug commands and stuff. I'm willing to get this
> >>>>>     working, but I just don't know how :). Any help is
> >>>>> appreciated :).
> >>>>>
> >>>>>
> >>>>> Did you use ceph-deploy? (the link to the quick start guide makes 
> >>>>> me think you did)
> >>>>>
> >>>>> If that was the case, did you get any warnings/errors at all?
> >>>>>
> >>>>> ceph-deploy is very verbose because some of these things are hard 
> >>>>> to debug. Mind sharing that output?
> >>>>>
> >>>>>
> >>>>>     Met vriendelijke groet/With kind regards,
> >>>>>
> >>>>>     Tijn Buijs
> >>>>>
> >>>>>     tijn@cloud.nl | T. 0800-CLOUDNL / +31 (0)162 820 000 | F. +31 (0)162 820 001
> >>>>>     Cloud.nl B.V. | Minervum 7092D | 4817 ZK Breda | www.cloud.nl
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >
> 


-- 
Christian Balzer        Network/Systems Engineer                
chibi@gol.com           Global OnLine Japan/Fusion Communications
http://www.gol.com/

