I guess from the name that this pool is mapped to SSDs only, and you only have 20 SSDs. That means you should have roughly ~2000 effective PGs in total on those OSDs once replication is taken into account (about 100 per OSD). This pool alone has ~10k effective PGs with k+m=5 (2048 PGs x 5 shards, i.e. ~500 per SSD), and you seem to have 5 more pools...

Check "ceph osd df tree" to see how many PGs per OSD you actually have.

Try increasing these two options to "fix" it (example commands at the bottom of this message):

mon max pg per osd
osd max pg per osd hard ratio

Paul

On Fri, Sep 28, 2018 at 18:05, Vladimir Brik <vladimir.brik@xxxxxxxxxxxxxxxx> wrote:
>
> Hello
>
> I've attempted to increase the number of placement groups of the pools
> in our test cluster, and now ceph status (below) is reporting problems.
> I am not sure what is going on or how to fix this. The troubleshooting
> scenarios in the docs don't seem to quite match what I am seeing.
>
> I have no idea how to begin to debug this. I see OSDs listed in
> "blocked_by" of pg dump, but I don't know how to interpret that. Could
> somebody assist, please?
>
> I attached the output of "ceph pg dump_stuck -f json-pretty" just in case.
>
> The cluster consists of 5 hosts, each with 16 HDDs and 4 SSDs. I am
> running 13.2.2.
>
> This is the affected pool:
> pool 6 'fs-data-ec-ssd' erasure size 5 min_size 4 crush_rule 6
> object_hash rjenkins pg_num 2048 pgp_num 2048 last_change 2493 lfor
> 0/2491 flags hashpspool,ec_overwrites stripe_width 12288 application cephfs
>
>
> Thanks,
>
> Vlad
>
>
> ceph status
>
>   cluster:
>     id:     47caa1df-42be-444d-b603-02cad2a7fdd3
>     health: HEALTH_WARN
>             Reduced data availability: 155 pgs inactive, 47 pgs peering,
>             64 pgs stale
>             Degraded data redundancy: 321039/114913606 objects degraded
>             (0.279%), 108 pgs degraded, 108 pgs undersized
>
>   services:
>     mon: 5 daemons, quorum ceph-1,ceph-2,ceph-3,ceph-4,ceph-5
>     mgr: ceph-3(active), standbys: ceph-2, ceph-5, ceph-1, ceph-4
>     mds: cephfs-1/1/1 up {0=ceph-5=up:active}, 4 up:standby
>     osd: 100 osds: 100 up, 100 in; 165 remapped pgs
>
>   data:
>     pools:   6 pools, 5120 pgs
>     objects: 22.98 M objects, 88 TiB
>     usage:   154 TiB used, 574 TiB / 727 TiB avail
>     pgs:     3.027% pgs not active
>              321039/114913606 objects degraded (0.279%)
>              4903 active+clean
>              105  activating+undersized+degraded+remapped
>              61   stale+active+clean
>              47   remapped+peering
>              3    stale+activating+undersized+degraded+remapped
>              1    active+clean+scrubbing+deep

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
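
As a concrete sketch of the check and the suggested workaround above, assuming the Mimic-style centralized config ("ceph config set") is in use; the values 400 and 5 are illustrative placeholders, not recommendations:

# Show how many PGs each OSD currently holds (see the PGS column):
ceph osd df tree

# Raise the per-OSD PG limit and the hard-ratio multiplier cluster-wide.
# 400 and 5 are example values only; pick numbers that fit your cluster,
# and note that the OSD-side hard ratio may only take effect after a restart.
ceph config set global mon_max_pg_per_osd 400
ceph config set global osd_max_pg_per_osd_hard_ratio 5

The same two settings can instead be placed in ceph.conf under [global] and picked up on daemon restart. Either way this only raises the limits; the underlying cause is that the PG count per SSD is far above the usual ~100-PGs-per-OSD guideline.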