Yes, it was the CRUSH map. I updated it and distributed the 20 OSDs across the 2 hosts correctly; finally all PGs are healthy.
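For anyone who hits the same problem later, the usual way to fix a CRUSH layout like this is the decompile / edit / recompile cycle. This is only a rough sketch (the file names are arbitrary), not the exact steps from this thread:

  # dump the current CRUSH map and decompile it to editable text
  ceph osd getcrushmap -o crush.bin
  crushtool -d crush.bin -o crush.txt
  # edit crush.txt so each host bucket contains the OSDs that really
  # live on that host, then recompile and inject the map back
  crushtool -c crush.txt -o crush.new
  ceph osd setcrushmap -i crush.new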
Thanks guys, I really appreciate your help!
On Thu, 24 Mar 2016 at 07:25 Goncalo Borges <goncalo.borges@xxxxxxxxxxxxx> wrote:
Hi Zhang...
I think you are dealing with two different problems.
The first problem refers to the number of PGs per OSD. That was already discussed, and there are no more messages concerning it now.
The second problem you are experiencing seems to be that all your OSDs are under the same host. Besides that, osd.0 appears twice, under two different hosts (I do not really know why that is happening). If you are using the default CRUSH rules, Ceph is not able to replicate objects (even with size 2) across two different hosts, because all your OSDs are in just one host.
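For reference, the reason host placement matters is that the default replicated rule chooses one OSD per host ("step chooseleaf ... type host"). A decompiled default rule typically looks something like the following (rule and bucket names may differ on your cluster, so treat this as a sketch):

  rule replicated_ruleset {
      ruleset 0
      type replicated
      min_size 1
      max_size 10
      step take default
      # pick N distinct hosts, then one OSD inside each of them
      step chooseleaf firstn 0 type host
      step emit
  }

With all OSDs under a single host bucket, a rule like this can only ever emit one replica, so the PGs stay undersized/degraded.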
Cheers
Goncalo
From: Zhang Qiang [dotslash.lu@xxxxxxxxx]
Sent: 23 March 2016 23:17
To: Goncalo Borges
Cc: Oliver Dzombic; ceph-users
Subject: Re: Need help for PG problem
And here's the osd tree if it matters.
ID WEIGHT   TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 22.39984 root default
-2 21.39984     host 10
 0  1.06999         osd.0        up  1.00000          1.00000
 1  1.06999         osd.1        up  1.00000          1.00000
 2  1.06999         osd.2        up  1.00000          1.00000
 3  1.06999         osd.3        up  1.00000          1.00000
 4  1.06999         osd.4        up  1.00000          1.00000
 5  1.06999         osd.5        up  1.00000          1.00000
 6  1.06999         osd.6        up  1.00000          1.00000
 7  1.06999         osd.7        up  1.00000          1.00000
 8  1.06999         osd.8        up  1.00000          1.00000
 9  1.06999         osd.9        up  1.00000          1.00000
10  1.06999         osd.10       up  1.00000          1.00000
11  1.06999         osd.11       up  1.00000          1.00000
12  1.06999         osd.12       up  1.00000          1.00000
13  1.06999         osd.13       up  1.00000          1.00000
14  1.06999         osd.14       up  1.00000          1.00000
15  1.06999         osd.15       up  1.00000          1.00000
16  1.06999         osd.16       up  1.00000          1.00000
17  1.06999         osd.17       up  1.00000          1.00000
18  1.06999         osd.18       up  1.00000          1.00000
19  1.06999         osd.19       up  1.00000          1.00000
-3  1.00000     host 148_96
 0  1.00000         osd.0        up  1.00000          1.00000
On Wed, 23 Mar 2016 at 19:10 Zhang Qiang <dotslash.lu@xxxxxxxxx> wrote:
Oliver, Goncalo,
Sorry to disturb again, but recreating the pool with a smaller pg_num didn't seem to work; now all 666 PGs are degraded + undersized.
New status:

cluster d2a69513-ad8e-4b25-8f10-69c4041d624d
 health HEALTH_WARN
        666 pgs degraded
        82 pgs stuck unclean
        666 pgs undersized
 monmap e5: 5 mons at {1=10.3.138.37:6789/0,2=10.3.138.39:6789/0,3=10.3.138.40:6789/0,4=10.3.138.59:6789/0,GGZ-YG-S0311-PLATFORM-138=10.3.138.36:6789/0}
        election epoch 28, quorum 0,1,2,3,4 GGZ-YG-S0311-PLATFORM-138,1,2,3,4
 osdmap e705: 20 osds: 20 up, 20 in
  pgmap v1961: 666 pgs, 1 pools, 0 bytes data, 0 objects
        13223 MB used, 20861 GB / 21991 GB avail
             666 active+undersized+degraded
There is only one pool and its size is 3, so I think, according to the formula, (20 * 100) / 3 ≈ 666 PGs is reasonable.
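Side note for anyone reusing this math: pgcalc and the Ceph docs also suggest rounding the result to a power of two, so 512 or 1024 would be more conventional than 666. Creating a pool with an explicit pg_num/pgp_num would look roughly like this (the pool name is just an example):

  # (20 OSDs * 100) / 3 replicas ≈ 667 -> round to a power of two
  ceph osd pool create testpool 512 512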
I updated the health detail and also attached a pg query result on gist (https://gist.github.com/dotSlashLu/22623b4cefa06a46e0d4).
On Wed, 23 Mar 2016 at 09:01 Dotslash Lu <dotslash.lu@xxxxxxxxx> wrote:
Hello Gonçalo,
Thanks for the reminder. I was just setting up the cluster for testing, so don't worry, I can simply remove the pool. And I've learnt that since the replication size and the number of pools are related to pg_num, I'll consider them carefully before deploying any data.

Hi Zhang...
If I can add some more info: changing the number of PGs is a heavy operation, and as far as I know, you should NEVER decrease PGs. From the notes in pgcalc (http://ceph.com/pgcalc/):
"It's also important to know that the PG count can be increased, but NEVER decreased without destroying / recreating the pool. However, increasing the PG Count of a pool is one of the most impactful events in a Ceph Cluster, and should be avoided for production clusters if possible."
So, in your case, I would consider adding more OSDs.
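If you do add OSDs and later need more PGs, the increase itself is just two pool settings, applied in this order (a rough example only; the pool name and the target value are placeholders):

  ceph osd pool set <poolname> pg_num 1024
  # pgp_num must be raised afterwards so the new PGs are actually rebalanced
  ceph osd pool set <poolname> pgp_num 1024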
Cheers
Goncalo
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com