Re: Gentoo & ceph 0.67 & pg stuck After fresh Installation


 



On Sun, 19 Jan 2014, Sherry Shahbazi wrote:
> Hi Philipp,
> 
> Installing "ntp" on each server might solve the clock skew problem.
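> 
> On Gentoo that would be something along these lines on both dp1 and dp2
> (a sketch; it assumes the standard net-misc/ntp package and OpenRC):
> 
> emerge --ask net-misc/ntp      # install the ntp daemon
> rc-update add ntpd default     # start it at boot
> /etc/init.d/ntpd start         # start it now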

At the very least, a one-time 'ntpdate time.apple.com' should make that 
issue go away for the time being.
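
For example (a sketch; run as root on whichever node is skewed, here 
mon.vmsys-dp2, then re-check):

  ntpdate time.apple.com   # one-shot clock sync against a public time server
  ceph health detail       # the "clock skew detected" warning should clear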

s

>  
> Best Regards
> Sherry
> 
> 
> On Sunday, January 19, 2014 6:34 AM, Philipp Strobl <philipp@xxxxxxxxxxxx>
> wrote:
> Hi Aaron,
> 
> sorry for taking so long...
> 
> After I add the OSDs and buckets to the crushmap, I get:
> 
> ceph osd tree
> # id    weight    type name    up/down    reweight
> -3    1    host dp2
> 1    1        osd.1    up    1   
> -2    1    host dp1
> 0    1        osd.0    up    1   
> -1    0    root default
> 
> 
> Both OSDs are up and in:
> 
> ceph osd stat
> e25: 2 osds: 2 up, 2 in
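> 
> Roughly, the crush additions looked like the commands below (a sketch;
> the two "crush move ... root=default" lines at the end are only shown
> for completeness - judging from the tree above, the host buckets do not
> sit under root default yet, so that step may be what is missing):
> 
> ceph osd crush add-bucket dp1 host       # create the host buckets
> ceph osd crush add-bucket dp2 host
> ceph osd crush add osd.0 1.0 host=dp1    # place each OSD with weight 1
> ceph osd crush add osd.1 1.0 host=dp2
> ceph osd crush move dp1 root=default     # hang the hosts under the root
> ceph osd crush move dp2 root=default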
> 
> ceph health detail says:
> 
> HEALTH_WARN 292 pgs stuck inactive; 292 pgs stuck unclean; clock skew
> detected on mon.vmsys-dp2
> pg 3.f is stuck inactive since forever, current state creating, last acting
> []
> pg 0.c is stuck inactive since forever, current state creating, last acting
> []
> pg 1.d is stuck inactive since forever, current state creating, last acting
> []
> pg 2.e is stuck inactive since forever, current state creating, last acting
> []
> pg 3.8 is stuck inactive since forever, current state creating, last acting
> []
> pg 0.b is stuck inactive since forever, current state creating, last acting
> []
> pg 1.a is stuck inactive since forever, current state creating, last acting
> []
> ...
> pg 2.c is stuck unclean since forever, current state creating, last acting
> []
> pg 1.f is stuck unclean since forever, current state creating, last acting
> []
> pg 0.e is stuck unclean since forever, current state creating, last acting
> []
> pg 3.d is stuck unclean since forever, current state creating, last acting
> []
> pg 2.f is stuck unclean since forever, current state creating, last acting
> []
> pg 1.c is stuck unclean since forever, current state creating, last acting
> []
> pg 0.d is stuck unclean since forever, current state creating, last acting
> []
> pg 3.e is stuck unclean since forever, current state creating, last acting
> []
> mon.vmsys-dp2 addr 10.0.0.22:6789/0 clock skew 16.4914s > max 0.05s (latency
> 0.00666228s)
> 
> All PGs have the same status.
> 
> Is the clock skew an important factor?
> 
> I compiled Ceph like this (output of 'eix ceph'):
> ...
> Installed versions:  0.67{tbz2}(00:54:50 01/08/14)(fuse -debug -gtk
> -libatomic -radosgw -static-libs -tcmalloc)
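> 
> i.e. installed with something like this (a sketch; it assumes the USE
> flags go into a plain /etc/portage/package.use file rather than a
> directory, and matches the flags shown above):
> 
> echo "sys-cluster/ceph fuse -debug -gtk -libatomic -radosgw -static-libs -tcmalloc" >> /etc/portage/package.use
> emerge --ask "=sys-cluster/ceph-0.67*"   # pin to the installed 0.67 series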
>  
> The cluster name is vmsys; the servers are dp1 and dp2.
> config:
> 
> [global]
>     auth cluster required = none
>     auth service required = none
>     auth client required = none
>     auth supported = none
>     fsid = 265d12ac-e99d-47b9-9651-05cb2b4387a6
> 
> [mon.vmsys-dp1]
>     host = dp1
>     mon addr = INTERNAL-IP1:6789
>     mon data = /var/lib/ceph/mon/ceph-vmsys-dp1
> 
> [mon.vmsys-dp2]
>     host = dp2
>     mon addr = INTERNAL-IP2:6789
>     mon data = /var/lib/ceph/mon/ceph-vmsys-dp2
> 
> [osd]
> [osd.0]
>     host = dp1
>     devs = /dev/sdb1
>     osd_mkfs_type = xfs
>     osd data = /var/lib/ceph/osd/ceph-0
> 
> [osd.1]
>     host = dp2
>     devs = /dev/sdb1
>     osd_mkfs_type = xfs
>     osd data = /var/lib/ceph/osd/ceph-1
> 
> [mds.vmsys-dp1]
>         host = dp1
> 
> [mds.vmsys-dp2]
>         host = dp2
> 
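> Since there are two monitors, both have to be up for quorum (a majority
> of 2 is 2), so a quick sanity check on the mon side might be (a sketch,
> standard ceph CLI commands):
> 
> ceph mon stat        # both mon.vmsys-dp1 and mon.vmsys-dp2 should be in quorum
> ceph quorum_status   # shows which monitors are in the quorum and who leads
> ceph -s              # overall cluster status, including the clock-skew warning
> 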
> 
> 
> Hope this is helpful - I really don't know at the moment what is wrong.
> 
> Perhaps I should try the manual-deployment howto from Inktank - or do you have an idea?
> 
> 
> 
> Best Philipp
> 
> http://www.pilarkto.net
> On 10.01.2014 20:50, Aaron Ten Clay wrote:
>       Hi Philipp,
> 
> It sounds like perhaps you don't have any OSDs that are both "up" and
> "in" the cluster. Can you provide the output of "ceph health detail"
> and "ceph osd tree" for us?
> 
> As for the "howto" you mentioned, I added some notes to the top but
> never really updated the body of the document... I'm not entirely sure
> it's straightforward or up to date any longer :) I'd be happy to make
> changes as needed but I haven't manually deployed a cluster in several
> months, and Inktank now has a manual deployment guide for Ceph at
> http://ceph.com/docs/master/install/manual-deployment/
> 
> -Aaron
> 
> 
> 
> On Fri, Jan 10, 2014 at 6:57 AM, Philipp Strobl <philipp@xxxxxxxxxxxx>
> wrote:
>       Hi,
> 
> After managing to deploy Ceph manually on Gentoo (the ceph-disk tools
> are under /usr/usr/sbin...), the daemons come up properly,
> but "ceph health" shows a warning for all PGs stuck unclean.
> This is strange behavior for a clean new installation, I guess.
> 
> So the question is: am I doing something wrong, or can I reset the
> PGs to get the cluster running?
> 
> Also, the rbd client and mount.ceph hang with no response.
> 
> I used this howto: https://github.com/aarontc/ansible-playbooks/blob/master/roles/ceph.notes-on-deployment.rst
> 
> Or rather, our German translation/expansion:
> http://wiki.open-laboratory.de/Intern:IT:HowTo:Ceph
> 
> With auth support ... = none
> 
> 
> Best regards 
> And thank you in advance
> 
> Philipp Strobl
> 
> 
> 
> 
> 
> 
> 
> --
> Aaron Ten Clay
> http://www.aarontc.com/
> 
> 
> 
> 
> 
> 
> 
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
