Re: Need help for PG problem

You should have settled on the nearest power of 2, which for 666 is
512. Since you created the cluster yourself and IIRC it is a testbed,
you could recreate it again; however, it will be less of a hassle to
just increase the PGs to the next power of two: 1024.
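
Something along these lines should do it (just a sketch; substitute
your real pool name for "data", which I'm only using as a placeholder,
and remember that pgp_num has to be raised as well so the new PGs
actually get rebalanced):

    ceph osd pool set data pg_num 1024
    ceph osd pool set data pgp_num 1024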

Your 20 OSDs appear to be equally sized in your crushmap, so ~150 PGs
per OSD should still be acceptable.
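
(Rough arithmetic, assuming the pool stays at size 3: 1024 PGs * 3
replicas / 20 OSDs comes out to roughly 154 PG copies per OSD.)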

Hope you nail it this time :)

-K.


On 03/23/2016 01:10 PM, Zhang Qiang wrote:
> Oliver, Goncalo,
>
> Sorry to disturb again, but recreating the pool with a smaller pg_num
> didn't seem to work; now all 666 PGs are degraded + undersized.
>
> New status:
>     cluster d2a69513-ad8e-4b25-8f10-69c4041d624d
>      health HEALTH_WARN
>             666 pgs degraded
>             82 pgs stuck unclean
>             666 pgs undersized
>      monmap e5: 5 mons at
> {1=10.3.138.37:6789/0,2=10.3.138.39:6789/0,3=10.3.138.40:6789/0,4=10.3.138.59:6789/0,GGZ-YG-S0311-PLATFORM-138=10.3.138.36:6789/0}
>             election epoch 28, quorum 0,1,2,3,4 GGZ-YG-S0311-PLATFORM-138,1,2,3,4
>      osdmap e705: 20 osds: 20 up, 20 in
>       pgmap v1961: 666 pgs, 1 pools, 0 bytes data, 0 objects
>             13223 MB used, 20861 GB / 21991 GB avail
>                  666 active+undersized+degraded
>
> There is only one pool and its size is 3. So I think that, according
> to the algorithm, (20 * 100) / 3 ≈ 666 PGs is reasonable.
>
> I updated the health detail and also attached a pg query result as a
> gist (https://gist.github.com/dotSlashLu/22623b4cefa06a46e0d4).
>
> On Wed, 23 Mar 2016 at 09:01 Dotslash Lu <dotslash.lu@xxxxxxxxx
> <mailto:dotslash.lu@xxxxxxxxx>> wrote:
>
>     Hello Gonçalo,
>
>     Thanks for the reminder. I was just setting up the cluster for
>     testing, so don't worry, I can simply remove the pool. And I've
>     learnt that, since the replication factor and the number of pools
>     are related to pg_num, I'll consider them carefully before
>     deploying any data.
>
>     On Mar 23, 2016, at 6:58 AM, Goncalo Borges
>     <goncalo.borges@xxxxxxxxxxxxx <mailto:goncalo.borges@xxxxxxxxxxxxx>>
>     wrote:
>
>>     Hi Zhang...
>>
>>     If I can add some more info: changing the number of PGs is a heavy
>>     operation, and as far as I know, you should NEVER decrease PGs.
>>     From the notes in pgcalc (http://ceph.com/pgcalc/):
>>
>>     "It's also important to know that the PG count can be increased,
>>     but NEVER decreased without destroying / recreating the pool.
>>     However, increasing the PG Count of a pool is one of the most
>>     impactful events in a Ceph Cluster, and should be avoided for
>>     production clusters if possible."
>>
>>     So, in your case, I would consider adding more OSDs.
>>
>>     Cheers
>>     Goncalo
>
>
>

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



