Re: Need help for PG problem


 



Are you running with the default failure domain of 'host'?

If so, then with a pool size of 3 and your 20 OSDs physically located on only
2 hosts, Ceph cannot find a third host to map the third replica to.

Either add a host and move some OSDs there, or reduce the pool size to 2.
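
For reference, the commands involved would look roughly like this (an untested
sketch; <pool> and the host name 'node2' are placeholders, and osd.10's weight
is taken from your tree below):

    # Check which failure domain the crush rule actually uses
    # (look for "type": "host" in the chooseleaf step)
    ceph osd crush rule dump

    # Option 1: reduce the replica count to match your two hosts
    ceph osd pool set <pool> size 2

    # Option 2: after physically moving an OSD to a new host, create the
    # host bucket and update the OSD's crush location
    ceph osd crush add-bucket node2 host
    ceph osd crush move node2 root=default
    ceph osd crush set osd.10 1.06999 root=default host=node2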

-K.

On 03/23/2016 02:17 PM, Zhang Qiang wrote:
> And here's the osd tree if it matters.
> 
> ID WEIGHT   TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY 
> -1 22.39984 root default                                      
> -2 21.39984     host 10                                       
>  0  1.06999         osd.0        up  1.00000          1.00000 
>  1  1.06999         osd.1        up  1.00000          1.00000 
>  2  1.06999         osd.2        up  1.00000          1.00000 
>  3  1.06999         osd.3        up  1.00000          1.00000 
>  4  1.06999         osd.4        up  1.00000          1.00000 
>  5  1.06999         osd.5        up  1.00000          1.00000 
>  6  1.06999         osd.6        up  1.00000          1.00000 
>  7  1.06999         osd.7        up  1.00000          1.00000 
>  8  1.06999         osd.8        up  1.00000          1.00000 
>  9  1.06999         osd.9        up  1.00000          1.00000 
> 10  1.06999         osd.10       up  1.00000          1.00000 
> 11  1.06999         osd.11       up  1.00000          1.00000 
> 12  1.06999         osd.12       up  1.00000          1.00000 
> 13  1.06999         osd.13       up  1.00000          1.00000 
> 14  1.06999         osd.14       up  1.00000          1.00000 
> 15  1.06999         osd.15       up  1.00000          1.00000 
> 16  1.06999         osd.16       up  1.00000          1.00000 
> 17  1.06999         osd.17       up  1.00000          1.00000 
> 18  1.06999         osd.18       up  1.00000          1.00000 
> 19  1.06999         osd.19       up  1.00000          1.00000 
> -3  1.00000     host 148_96                                   
>  0  1.00000         osd.0        up  1.00000          1.00000
> 
> On Wed, 23 Mar 2016 at 19:10 Zhang Qiang <dotslash.lu@xxxxxxxxx> wrote:
> 
>     Oliver, Goncalo, 
> 
>     Sorry to disturb you again, but recreating the pool with a smaller
>     pg_num didn't seem to work; now all 666 PGs are degraded and undersized.
> 
>     New status:
>         cluster d2a69513-ad8e-4b25-8f10-69c4041d624d
>          health HEALTH_WARN
>                 666 pgs degraded
>                 82 pgs stuck unclean
>                 666 pgs undersized
>          monmap e5: 5 mons at
>     {1=10.3.138.37:6789/0,2=10.3.138.39:6789/0,3=10.3.138.40:6789/0,4=10.3.138.59:6789/0,GGZ-YG-S0311-PLATFORM-138=10.3.138.36:6789/0}
>                 election epoch 28, quorum 0,1,2,3,4
>     GGZ-YG-S0311-PLATFORM-138,1,2,3,4
>          osdmap e705: 20 osds: 20 up, 20 in
>           pgmap v1961: 666 pgs, 1 pools, 0 bytes data, 0 objects
>                 13223 MB used, 20861 GB / 21991 GB avail
>                      666 active+undersized+degraded
> 
>     There is only one pool and its size is 3, so according to the usual
>     formula, (20 * 100) / 3 = 666 PGs should be reasonable.
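> 
>     (A quick way to double-check those numbers against the cluster itself;
>     the pool name below is just a placeholder:)
> 
>         ceph osd pool get <pool> size      # replica count: 3
>         ceph osd pool get <pool> pg_num    # 666
>         echo $(( (20 * 100) / 3 ))         # ~100 PGs per OSD target -> 666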
> 
>     I updated the health detail and also attached a pg query result in a
>     gist (https://gist.github.com/dotSlashLu/22623b4cefa06a46e0d4).
> 
>     On Wed, 23 Mar 2016 at 09:01 Dotslash Lu <dotslash.lu@xxxxxxxxx> wrote:
> 
>         Hello Gonçalo,
> 
>         Thanks for the reminder. I was just setting up the cluster for
>         testing, so don't worry, I can simply remove the pool. And now that
>         I've learnt that the replica count and the number of pools factor
>         into the pg_num calculation, I'll consider them carefully before
>         deploying any data.
> 
>         On Mar 23, 2016, at 6:58 AM, Goncalo Borges <goncalo.borges@xxxxxxxxxxxxx> wrote:
> 
>>         Hi Zhang...
>>
>>         If I can add some more info: changing the PG count is a heavy
>>         operation, and as far as I know, you should NEVER decrease it.
>>         From the notes in pgcalc (http://ceph.com/pgcalc/):
>>
>>         "It's also important to know that the PG count can be
>>         increased, but NEVER decreased without destroying / recreating
>>         the pool. However, increasing the PG Count of a pool is one of
>>         the most impactful events in a Ceph Cluster, and should be
>>         avoided for production clusters if possible."
>>
>>         So, in your case, I would consider adding more OSDs.
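>>
>>         (If the PG count does need to grow later, the change itself is just
>>         two pool settings; the pool name and the target value 1024 here are
>>         placeholders, and the resulting data movement is what makes it so
>>         impactful:)
>>
>>             ceph osd pool set <pool> pg_num 1024
>>             ceph osd pool set <pool> pgp_num 1024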
>>
>>         Cheers
>>         Goncalo
> 
> 
> 

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



