Placement groups forever in "creating" state and don't map to OSD

Hi Kapil,
The crush map is below
# begin crush map
# devices
device 0 osd.0
device 1 osd.1
# types
type 0 osd
type 1 host
type 2 rack
type 3 row
type 4 room
type 5 datacenter
type 6 root

# buckets
root default {
        id -1           # do not change unnecessarily
        # weight 1.000
        alg straw
        hash 0  # rjenkins1
        item osd.0 weight 0.500
        item osd.1 weight 0.500
}

# rules
rule data {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}
rule metadata {
        ruleset 1
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}
rule rbd {
        ruleset 2
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}
# end crush map
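
One thing I notice in it: both OSDs sit directly under "root default", while all three rules end with "step chooseleaf firstn 0 type host", so there is no host-type bucket for CRUSH to choose from. If that is the problem, a map with host buckets would look roughly like the sketch below ("node1" and "node2" are placeholder names only, not the real hostnames):

# sketch only - "node1"/"node2" are placeholders
host node1 {
        id -2           # do not change unnecessarily
        alg straw
        hash 0  # rjenkins1
        item osd.0 weight 0.500
}
host node2 {
        id -3           # do not change unnecessarily
        alg straw
        hash 0  # rjenkins1
        item osd.1 weight 0.500
}
root default {
        id -1           # do not change unnecessarily
        alg straw
        hash 0  # rjenkins1
        item node1 weight 0.500
        item node2 weight 0.500
}

The other direction would be to keep the flat layout and change each rule to "step chooseleaf firstn 0 type osd" (the "osd crush chooseleaf type = 0" setting in ceph.conf produces that kind of rule, if I recall correctly), so replicas are picked across bare OSDs instead of hosts.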


Yogesh Devi,
Architect,  Dell Cloud Clinical Archive
Dell


Land Phone     +91 80 28413000 Extension - 2781
Hand Phone    +91 99014 71082


-----Original Message-----
From: Kapil Sharma [mailto:ksharma@xxxxxxxx]
Sent: Monday, August 04, 2014 3:59 PM
To: Devi, Yogesh
Cc: matt at cactuar.net; ceph-users at lists.ceph.com; Pulicken, Antony
Subject: Re: Placement groups forever in "creating" state and don't map to OSD

I think ceph osd tree should list your OSDs under the node bucket.
Could you also check your osd crush map with this -

ceph osd getcrushmap -o filename
crushtool -d filename -o filename.txt

You should see your OSDs in the #devices section and your three servers in the #buckets section.
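
If the buckets do need changing, the reverse direction uses roughly the same tools
(a sketch - pick whatever filenames you like): edit the decompiled text, recompile
it, and inject it back into the cluster:

crushtool -c filename.txt -o filename.new
ceph osd setcrushmap -i filename.new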


Regards,
Kapil.



On Mon, 2014-08-04 at 10:09 +0000, Yogesh_Devi at Dell.com wrote:
>
> Hi Kapil
>
> Thanks for responding :)
>
> My mon server and two OSDs are running on three separate servers, one
> per node. All are SLES SP3.
>
>
>
> Below is the "ceph osd tree" output from my mon server box:
>
>
>
> slesceph1: # ceph osd tree
>
> # id    weight  type name       up/down reweight
>
> -1      1       root default
>
> 0       0.5             osd.0   up      1
>
> 1       0.5             osd.1   up      1
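>
> For comparison, a tree where the OSDs hang off host buckets would look roughly
> like this (a sketch only - "node1" and "node2" are placeholder host names):
>
> # id    weight  type name       up/down reweight
> -1      1       root default
> -2      0.5             host node1
> 0       0.5                     osd.0   up      1
> -3      0.5             host node2
> 1       0.5                     osd.1   up      1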
>
> Yogesh
>
>
>
> Land Phone     +91 80 28413000 Extension - 2781
> Hand Phone    +91 99014 71082
>
>
>
> -----Original Message-----
> From: Kapil Sharma [mailto:ksharma at suse.com]
> Sent: Monday, August 04, 2014 3:31 PM
> To: Devi, Yogesh
> Cc: matt at cactuar.net; ceph-users at lists.ceph.com; Pulicken, Antony
> Subject: Re: [ceph-users] Placement groups forever in "creating" state
> and don't map to OSD
>
> Hi Yogesh,
>
> Are your two OSDs on the same node? Could you check the osd tree output
> with the command - "ceph osd tree"
>
>
>
> Regards,
> Kapil.
>
>
>
> On Mon, 2014-08-04 at 09:22 +0000, Yogesh_Devi at Dell.com wrote:
> >
> > Matt
> >
> > I am using SUSE Linux Enterprise Server 11 SP3 (SLES 11 SP3)
> >
> >
> >
> > I don't think I have SELinux enabled.
> >
> >
> >
> > Yogesh Devi,
> >
> > Architect,  Dell Cloud Clinical Archive
> >
> > Dell
> >
> >
> >
> >
> >
> > Land Phone     +91 80 28413000 Extension - 2781
> >
> > Hand Phone    +91 99014 71082
> >
> >
> >
> >
> > From: Matt Harlum [mailto:matt at cactuar.net]
> > Sent: Monday, August 04, 2014 1:43 PM
> > To: Devi, Yogesh
> > Cc: ceph-users at lists.ceph.com; Pulicken, Antony
> > Subject: Re: [ceph-users] Placement groups forever in "creating" state
> > and don't map to OSD
> >
> >
> >
> >
> > Hi
> >
> >
> >
> >
> > What distributions are your machines using? And is SELinux enabled on
> > them?
> >
> >
> >
> >
> >
> > I ran into the same issue once; I had to disable SELinux on all the
> > machines and then reinstall.
> >
> >
> >
> >
> >
> >
> >
> > On 4 Aug 2014, at 5:25 pm, Yogesh_Devi at Dell.com wrote:
> >
> >
> >
> >
> >
> > Matt
> >
> >
> > Thanks for responding
> >
> >
> > As suggested, I tried to set replication to 2x by using the commands
> > you provided:
> >
> >
> >
> >
> >
> > $ceph osd pool set data size 2
> >
> >
> > $ceph osd pool set data min_size 2
> >
> >
> > $ceph osd pool set rbd size 2
> >
> >
> > $ceph osd pool set rbd min_size 2
> >
> >
> > $ceph osd pool set metadata size 2
> >
> >
> > $ceph osd pool set metadata min_size 2
> >
> >
> >
> >
> >
> > It told me -
> >
> >
> > set pool 0 size to 2
> >
> >
> > set pool 0 min_size to 2
> >
> >
> > set pool 2 size to 2
> >
> >
> > set pool 2 min_size to 2
> >
> >
> > set pool 1 size to 2
> >
> >
> > set pool 1 min_size to 2
> >
> >
> >
> >
> >
> > To verify that pool size had indeed changed - I checked again
> >
> >
> >
> >
> >
> > $ceph osd dump | grep 'rep size'
> >
> >
> > pool 0 'data' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num
> > 64 pgp_num 64 last_change 90 owner 0 crash_replay_interval 45
> >
> >
> > pool 1 'metadata' rep size 2 crush_ruleset 1 object_hash rjenkins
> > pg_num 64 pgp_num 64 last_change 94 owner 0
> >
> >
> > pool 2 'rbd' rep size 2 crush_ruleset 2 object_hash rjenkins pg_num
> > 64 pgp_num 64 last_change 92 owner 0
> >
> >
> > pool 3 'datapool' rep size 2 crush_ruleset 2 object_hash rjenkins
> > pg_num 10 pgp_num 10 last_change 38 owner 0
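> >
> > That works out to 64 + 64 + 64 + 10 = 202 placement groups in total, which
> > matches the 202 pgs reported as stuck below.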
> >
> >
> >
> >
> >
> >
> >
> >
> > However, my cluster is still in the same state:
> >
> >
> >
> >
> >
> > $ceph -s
> >
> >
> >    health HEALTH_WARN 202 pgs stuck inactive; 202 pgs stuck unclean
> >
> >
> >    monmap e1: 1 mons at {slesceph1=160.110.73.200:6789/0}, election
> > epoch 1, quorum 0 slesceph1
> >
> >
> >    osdmap e106: 2 osds: 2 up, 2 in
> >
> >
> >     pgmap v171: 202 pgs: 202 creating; 0 bytes data, 10306 MB used,
> > 71573 MB / 81880 MB avail
> >
> >
> >    mdsmap e1: 0/0/1 up
> >
> >
> > Yogesh Devi,
> >
> >
> > Architect,  Dell Cloud Clinical Archive
> >
> >
> > Dell
> >
> >
> >
> >
> >
> >
> >
> >
> > Land Phone     +91 80 28413000 Extension - 2781
> >
> >
> > Hand Phone    +91 99014 71082
> >
> >
> >
> >
> >
> > From: Matt Harlum [mailto:matt at cactuar.net]
> > Sent: Saturday, August 02, 2014 6:01 AM
> > To: Devi, Yogesh
> > Cc: Pulicken, Antony
> > Subject: Re: [ceph-users] Placement groups forever in "creating" state
> > and don't map to OSD
> >
> >
> >
> >
> >
> > Hi Yogesh,
> >
> >
> >
> >
> >
> > By default ceph is configured to create 3 replicas of the data; with
> > only two OSDs it cannot create all of the pgs required to do this.
> >
> >
> >
> >
> >
> > You will need to change the replication to 2x for your pools; this can
> > be done like so:
> >
> >
> > ceph osd pool set data size 2
> >
> >
> > ceph osd pool set data min_size 2
> >
> >
> > ceph osd pool set rbd size 2
> >
> >
> > ceph osd pool set rbd min_size 2
> >
> >
> > ceph osd pool set metadata size 2
> >
> >
> > ceph osd pool set metadata min_size 2
> >
> >
> >
> >
> >
> > Once you do this, your ceph cluster should go to a healthy state.
> >
> >
> >
> >
> >
> > Regards,
> >
> >
> > Matt
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > On 2 Aug 2014, at 12:57 am, Yogesh_Devi at Dell.com wrote:
> >
> >
> >
> >
> >
> >
> > Hello Ceph Experts :) ,
> >
> >
> >
> >
> >
> > I am using ceph (version 0.56.6) on SUSE Linux.
> >
> >
> > I created a simple cluster with one monitor server and two OSDs.
> >
> >
> > The conf file is attached
> >
> >
> >
> >
> >
> > When I start my cluster and do "ceph -s", I see the following
> > message:
> >
> >
> >
> >
> >
> > $ceph -s
> >
> >
> > health HEALTH_WARN 202 pgs stuck inactive; 202 pgs stuck unclean
> >
> >
> >    monmap e1: 1 mons at {slesceph1=160.110.73.200:6789/0}, election
> > epoch 1, quorum 0 slesceph1
> >
> >
> >    osdmap e56: 2 osds: 2 up, 2 in
> >
> >
> >     pgmap v100: 202 pgs: 202 creating; 0 bytes data, 10305 MB used,
> > 71574 MB / 81880 MB avail
> >
> >
> >    mdsmap e1: 0/0/1 up
> >
> >
> >
> >
> >
> >
> >
> >
> > Basically there is some problem with my placement groups - they are
> > forever stuck in "creating" state and there is no OSD associated with
> > them (despite having two OSDs that are up and in). When I do a
> > "ceph pg stat" I see the following:
> >
> >
> >
> >
> >
> > $ceph pg stat
> >
> >
> > v100: 202 pgs: 202 creating; 0 bytes data, 10305 MB used, 71574 MB /
> > 81880 MB avail
> >
> >
> >
> >
> >
> >
> >
> >
> > If I query any individual pg, I see it isn't mapped to any
> > OSD:
> >
> >
> > $ ceph pg 0.d query
> >
> >
> > pgid currently maps to no osd
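> >
> > As far as I understand, that message means CRUSH cannot find enough buckets
> > of the type the rule asks for. One way to check this offline (a sketch - flag
> > names may vary a little between crushtool versions) is to test the compiled
> > map, e.g. the one saved with "ceph osd getcrushmap -o crushmap.bin":
> >
> > crushtool -i crushmap.bin --test --rule 0 --num-rep 2 --show-mappings
> >
> > If no mappings (or only empty ones) come back for the rule, it cannot be
> > satisfied with the current bucket layout.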
> >
> >
> >
> >
> >
> > I tried restarting OSDs and tuning my configuration, to no
> > avail.
> >
> >
> >
> >
> >
> > Any suggestions?
> >
> >
> >
> >
> >
> > Yogesh Devi
> >
> >
> > <ceph.conf>
> >
> >
> >
> >
> >
>
>
>
>
