Re: Bright new cluster get all pgs stuck in inactive

Your CRUSH rule selects 3 different chassis, but your CRUSH
map defines no chassis buckets.
Add buckets of type chassis or change the rule to select hosts instead.
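
For reference, a minimal sketch of both options; the chassis bucket name,
the pool name placeholder, and the exact set of hosts to move are
illustrative, not taken from your cluster:

    # Option A: create chassis buckets and move the hosts under them
    ceph osd crush add-bucket chassis1 chassis
    ceph osd crush move chassis1 root=default
    ceph osd crush move ip-10-8-5-246 chassis=chassis1
    # ... repeat add-bucket/move for the remaining hosts and chassis

    # Option B: create a replicated rule with host as the failure domain
    # and point the pools at it
    ceph osd crush rule create-replicated replicated_host default host
    ceph osd pool set <pool-name> crush_rule replicated_host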

Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Tue, Jan 29, 2019 at 7:40 PM PHARABOT Vincent
<Vincent.PHARABOT@xxxxxxx> wrote:
>
> Sorry JC, here is the correct osd crush rule dump (type=chassis instead of host)
>
>
>
> # ceph osd crush rule dump
> [
>     {
>         "rule_id": 0,
>         "rule_name": "replicated_rule",
>         "ruleset": 0,
>         "type": 1,
>         "min_size": 1,
>         "max_size": 10,
>         "steps": [
>             {
>                 "op": "take",
>                 "item": -1,
>                 "item_name": "default"
>             },
>             {
>                 "op": "chooseleaf_firstn",
>                 "num": 0,
>                 "type": "chassis"
>             },
>             {
>                 "op": "emit"
>             }
>         ]
>     }
> ]
>
>
>
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On behalf of PHARABOT Vincent
> Sent: Tuesday, January 29, 2019 19:33
> To: Jean-Charles Lopez <jelopez@xxxxxxxxxx>
> Cc: ceph-users@xxxxxxxxxxxxxx
> Subject: Re:  Bright new cluster get all pgs stuck in inactive
>
>
>
> Thanks for the quick reply
>
>
>
> Here is the result
>
>
>
> # ceph osd crush rule dump
> [
>     {
>         "rule_id": 0,
>         "rule_name": "replicated_rule",
>         "ruleset": 0,
>         "type": 1,
>         "min_size": 1,
>         "max_size": 10,
>         "steps": [
>             {
>                 "op": "take",
>                 "item": -1,
>                 "item_name": "default"
>             },
>             {
>                 "op": "chooseleaf_firstn",
>                 "num": 0,
>                 "type": "host"
>             },
>             {
>                 "op": "emit"
>             }
>         ]
>     }
> ]
>
>
>
> From: Jean-Charles Lopez [mailto:jelopez@xxxxxxxxxx]
> Sent: Tuesday, January 29, 2019 19:30
> To: PHARABOT Vincent <Vincent.PHARABOT@xxxxxxx>
> Cc: ceph-users@xxxxxxxxxxxxxx
> Subject: Re:  Bright new cluster get all pgs stuck in inactive
>
>
>
> Hi,
>
>
>
> I suspect your generated CRUSH rule is incorrect because of osd_crush_chooseleaf_type=2, and by default chassis buckets are not created.
>
>
>
> Changing the bucket type to host (osd_crush_chooseleaf_type=1, which is the default when using the old ceph-deploy or ceph-ansible) for your deployment should fix the problem.
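>
> (Illustration only, a minimal sketch of that setting: for a fresh deployment it means putting
>
>     osd crush chooseleaf type = 1
>
> in ceph.conf before the cluster is bootstrapped; for a cluster that is already running, the CRUSH rule or CRUSH map has to be edited directly instead.)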
>
>
>
> Could you show the output of ceph osd crush rule dump so we can verify how the rule was built?
>
>
>
> JC
>
>
>
> On Jan 29, 2019, at 10:08, PHARABOT Vincent <Vincent.PHARABOT@xxxxxxx> wrote:
>
>
>
> Hello,
>
>
>
> I have a brand new cluster with 2 pools, but the cluster keeps all PGs in an inactive state.
>
> I have 3 OSDs and 1 mon… everything seems OK, except that I cannot get the PGs into the active+clean state!
>
> I might be missing something obvious, but I really don't know what… Could someone help me?
>
> I tried to find answers in the list's mail threads, but no luck; the other situations seem different.
>
>
>
> Thank you very much for your help.
>
>
>
> Vincent
>
>
>
> # ceph -v
> ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)
>
> # ceph -s
>   cluster:
>     id:     ff4c91fb-3c29-4d9f-a26f-467d6b6a712e
>     health: HEALTH_WARN
>             Reduced data availability: 200 pgs inactive
>
>   services:
>     mon: 1 daemons, quorum ip-10-8-66-123.eu-west-2.compute.internal
>     mgr: ip-10-8-66-123.eu-west-2.compute.internal(active)
>     osd: 3 osds: 3 up, 3 in
>
>   data:
>     pools:   2 pools, 200 pgs
>     objects: 0 objects, 0 B
>     usage:   3.0 GiB used, 2.9 TiB / 2.9 TiB avail
>     pgs:     100.000% pgs unknown
>              200 unknown
>
>
>
> # ceph osd tree -f json-pretty
>
> {
>     "nodes": [
>         {
>             "id": -1,
>             "name": "default",
>             "type": "root",
>             "type_id": 10,
>             "children": [
>                 -3,
>                 -5,
>                 -7
>             ]
>         },
>         {
>             "id": -7,
>             "name": "ip-10-8-10-108",
>             "type": "host",
>             "type_id": 1,
>             "pool_weights": {},
>             "children": [
>                 2
>             ]
>         },
>         {
>             "id": 2,
>             "device_class": "hdd",
>             "name": "osd.2",
>             "type": "osd",
>             "type_id": 0,
>             "crush_weight": 0.976593,
>             "depth": 2,
>             "pool_weights": {},
>             "exists": 1,
>             "status": "up",
>             "reweight": 1.000000,
>             "primary_affinity": 1.000000
>         },
>         {
>             "id": -5,
>             "name": "ip-10-8-22-148",
>             "type": "host",
>             "type_id": 1,
>             "pool_weights": {},
>             "children": [
>                 1
>             ]
>         },
>         {
>             "id": 1,
>             "device_class": "hdd",
>             "name": "osd.1",
>             "type": "osd",
>             "type_id": 0,
>             "crush_weight": 0.976593,
>             "depth": 2,
>             "pool_weights": {},
>             "exists": 1,
>             "status": "up",
>             "reweight": 1.000000,
>             "primary_affinity": 1.000000
>         },
>         {
>             "id": -3,
>             "name": "ip-10-8-5-246",
>             "type": "host",
>             "type_id": 1,
>             "pool_weights": {},
>             "children": [
>                 0
>             ]
>         },
>         {
>             "id": 0,
>             "device_class": "hdd",
>             "name": "osd.0",
>             "type": "osd",
>             "type_id": 0,
>             "crush_weight": 0.976593,
>             "depth": 2,
>             "pool_weights": {},
>             "exists": 1,
>             "status": "up",
>             "reweight": 1.000000,
>             "primary_affinity": 1.000000
>         }
>     ],
>     "stray": []
> }
>
>
>
> # cat /etc/ceph/ceph.conf
> [global]
> fsid = ff4c91fb-3c29-4d9f-a26f-467d6b6a712e
> mon initial members = ip-10-8-66-123
> mon host = 10.8.66.123
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> pid file = /var/run/$cluster/$type.pid
>
> # Choose reasonable numbers for replicas and placement groups.
> osd pool default size = 3      # Write an object 3 times
> osd pool default min size = 2  # Allow writing 2 copies in a degraded state
> osd pool default pg num = 100
> osd pool default pgp num = 100
>
> # Choose a reasonable crush leaf type:
> # 0 for a 1-node cluster.
> # 1 for a multi-node cluster in a single rack.
> # 2 for a multi-node, multi-chassis cluster with multiple hosts in a chassis.
> # 3 for a multi-node cluster with hosts across racks, etc.
> osd crush chooseleaf type = 2
>
> [mon]
>         debug mon = 20
>
> # ceph health detail
> HEALTH_WARN Reduced data availability: 200 pgs inactive
> PG_AVAILABILITY Reduced data availability: 200 pgs inactive
>     pg 1.46 is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.47 is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.48 is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.49 is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.4a is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.4b is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.4c is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.4d is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.4e is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.4f is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.50 is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.51 is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.52 is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.53 is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.54 is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.55 is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.56 is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.57 is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.58 is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.59 is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.5a is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.5b is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.5c is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.5d is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.5e is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 1.5f is stuck inactive for 10848.068201, current state unknown, last acting []
>     pg 2.44 is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.48 is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.49 is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.4a is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.4b is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.4c is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.4d is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.4e is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.4f is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.50 is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.51 is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.52 is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.53 is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.54 is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.55 is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.56 is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.57 is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.58 is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.59 is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.5a is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.5b is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.5c is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.5d is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.5e is stuck inactive for 10846.400420, current state unknown, last acting []
>     pg 2.5f is stuck inactive for 10846.400420, current state unknown, last acting []
>
> This email and any attachments are intended solely for the use of the individual or entity to whom it is addressed and may be confidential and/or privileged.
>
> If you are not one of the named recipients or have received this email in error,
>
> (i) you should not read, disclose, or copy it,
>
> (ii) please notify sender of your receipt by reply email and delete this email and all attachments,
>
> (iii) Dassault Systèmes does not accept or assume any liability or responsibility for any use of or reliance on this email.
>
>
>
> Please be informed that your personal data are processed according to our data privacy policy as described on our website. Should you have any questions related to personal data protection, please contact 3DS Data Protection Officer at 3DS.compliance-privacy@xxxxxxx
>
>
>
> For other languages, go to https://www.3ds.com/terms/email-disclaimer
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



