Oops, sorry about not including the version. Everything is running 12.2.4 on Ubuntu 16.04.
Below is the output from ceph osd df. The OSDs are pretty full, hence the new OSD node. I did have to bump the nearfull ratio up to 0.90 and reweight a few OSDs to bring them a little closer to the average (the commands are sketched below the output).
ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS
0 ssd 1.74649 1.00000 1788G 15688M 1773G 0.86 0.01 88
1 ssd 1.74649 1.00000 1788G 16489M 1772G 0.90 0.01 96
2 ssd 1.74649 1.00000 1788G 17224M 1771G 0.94 0.01 86
3 ssd 1.74649 1.00000 1788G 16745M 1772G 0.91 0.01 100
4 ssd 1.74649 1.00000 1788G 17016M 1771G 0.93 0.01 109
5 ssd 1.74649 1.00000 1788G 15964M 1772G 0.87 0.01 101
6 ssd 1.74649 1.00000 1788G 15612M 1773G 0.85 0.01 95
7 ssd 1.74649 1.00000 1788G 16109M 1772G 0.88 0.01 93
8 hdd 9.09560 1.00000 9313G 7511G 1802G 80.65 1.21 169
9 hdd 9.09560 1.00000 9313G 7155G 2158G 76.83 1.16 161
10 hdd 9.09560 1.00000 9313G 7953G 1360G 85.39 1.28 179
11 hdd 9.09560 0.95000 9313G 7821G 1492G 83.98 1.26 176
12 hdd 9.09560 1.00000 9313G 7193G 2120G 77.24 1.16 162
13 hdd 9.09560 1.00000 9313G 8131G 1182G 87.30 1.31 183
14 hdd 9.09560 1.00000 9313G 7643G 1670G 82.07 1.23 172
15 hdd 9.09560 1.00000 9313G 7019G 2294G 75.36 1.13 158
16 hdd 9.09560 1.00000 9313G 7419G 1894G 79.66 1.20 167
17 hdd 9.09560 1.00000 9313G 7333G 1980G 78.74 1.18 165
18 hdd 9.09560 1.00000 9313G 7107G 2206G 76.31 1.15 160
19 hdd 9.09560 1.00000 9313G 7288G 2025G 78.25 1.18 164
20 hdd 9.09560 1.00000 9313G 8133G 1180G 87.32 1.31 183
21 hdd 9.09560 1.00000 9313G 7374G 1939G 79.17 1.19 166
22 hdd 9.09560 1.00000 9313G 7550G 1763G 81.07 1.22 170
23 hdd 9.09560 1.00000 9313G 7552G 1761G 81.08 1.22 170
24 hdd 9.09560 1.00000 9313G 7955G 1358G 85.42 1.28 179
25 hdd 9.09560 1.00000 9313G 7909G 1404G 84.92 1.28 178
26 hdd 9.09560 1.00000 9313G 7685G 1628G 82.51 1.24 173
27 hdd 9.09560 1.00000 9313G 7284G 2029G 78.21 1.18 164
28 hdd 9.09560 1.00000 9313G 7243G 2070G 77.77 1.17 163
29 hdd 9.09560 1.00000 9313G 7509G 1804G 80.63 1.21 169
30 hdd 9.09560 1.00000 9313G 7065G 2248G 75.86 1.14 159
31 hdd 9.09560 1.00000 9313G 7155G 2158G 76.83 1.16 161
32 hdd 9.09560 1.00000 9313G 6932G 2381G 74.43 1.12 156
33 hdd 9.09560 1.00000 9313G 6756G 2557G 72.54 1.09 152
34 hdd 9.09560 1.00000 9313G 7687G 1626G 82.54 1.24 173
35 hdd 9.09560 1.00000 9313G 6665G 2648G 71.57 1.08 150
36 hdd 9.09560 1.00000 9313G 7954G 1359G 85.41 1.28 179
37 hdd 9.09560 1.00000 9313G 7113G 2199G 76.38 1.15 160
38 hdd 9.09560 1.00000 9313G 7286G 2027G 78.23 1.18 164
39 hdd 9.09560 1.00000 9313G 7198G 2115G 77.28 1.16 162
40 hdd 9.09560 1.00000 9313G 7953G 1360G 85.39 1.28 179
41 hdd 9.09560 1.00000 9313G 6756G 2557G 72.54 1.09 152
42 hdd 9.09560 1.00000 9313G 7241G 2072G 77.75 1.17 163
43 hdd 9.09560 1.00000 9313G 7063G 2250G 75.84 1.14 159
44 hdd 9.09560 1.00000 9313G 7951G 1362G 85.38 1.28 179
45 hdd 9.09560 1.00000 9313G 6708G 2605G 72.03 1.08 151
46 hdd 9.09560 1.00000 9313G 7598G 1715G 81.58 1.23 171
47 hdd 9.09560 1.00000 9313G 7065G 2248G 75.86 1.14 159
48 hdd 9.09560 1.00000 9313G 7868G 1445G 84.48 1.27 177
49 hdd 9.09560 1.00000 9313G 7331G 1982G 78.72 1.18 165
50 hdd 9.09560 1.00000 9313G 7377G 1936G 79.21 1.19 166
51 hdd 9.09560 1.00000 9313G 7065G 2248G 75.86 1.14 159
52 hdd 9.09560 1.00000 9313G 8041G 1272G 86.34 1.30 181
53 hdd 9.09560 1.00000 9313G 7152G 2161G 76.79 1.15 161
54 hdd 9.09560 1.00000 9313G 7505G 1808G 80.58 1.21 169
55 hdd 9.09560 1.00000 9313G 7556G 1757G 81.13 1.22 170
56 hdd 9.09560 1.00000 9313G 6841G 2472G 73.46 1.10 154
57 hdd 9.09560 1.00000 9313G 7598G 1715G 81.58 1.23 171
58 hdd 9.09560 1.00000 9313G 7245G 2068G 77.79 1.17 163
59 hdd 9.09560 1.00000 9313G 7152G 2161G 76.79 1.15 161
60 hdd 9.09560 1.00000 9313G 7864G 1449G 84.44 1.27 177
61 hdd 9.09560 1.00000 9313G 6890G 2423G 73.98 1.11 155
62 hdd 9.09560 1.00000 9313G 6884G 2429G 73.92 1.11 155
63 hdd 9.09560 1.00000 9313G 7776G 1537G 83.49 1.26 175
64 hdd 9.09560 1.00000 9313G 7597G 1716G 81.57 1.23 171
65 hdd 9.09560 1.00000 9313G 6706G 2607G 72.00 1.08 151
66 hdd 9.09560 0.95000 9313G 7820G 1493G 83.97 1.26 176
67 hdd 9.09560 0.95000 9313G 8043G 1270G 86.36 1.30 181
68 hdd 9.09560 1.00000 9313G 7643G 1670G 82.07 1.23 172
69 hdd 9.09560 1.00000 9313G 6620G 2693G 71.08 1.07 149
70 hdd 9.09560 1.00000 9313G 7775G 1538G 83.48 1.26 175
71 hdd 9.09560 1.00000 9313G 7731G 1581G 83.02 1.25 174
72 hdd 9.09560 1.00000 9313G 7598G 1715G 81.58 1.23 171
73 hdd 9.09560 1.00000 9313G 6575G 2738G 70.60 1.06 148
74 hdd 9.09560 1.00000 9313G 7155G 2158G 76.83 1.16 161
75 hdd 9.09560 1.00000 9313G 6220G 3093G 66.79 1.00 140
76 hdd 9.09560 1.00000 9313G 6796G 2517G 72.97 1.10 153
77 hdd 9.09560 1.00000 9313G 7725G 1587G 82.95 1.25 174
78 hdd 9.09560 1.00000 9313G 7241G 2072G 77.75 1.17 163
79 hdd 9.09560 1.00000 9313G 7597G 1716G 81.57 1.23 171
80 hdd 9.09560 1.00000 9313G 7467G 1846G 80.18 1.21 168
81 hdd 9.09560 1.00000 9313G 7909G 1404G 84.92 1.28 178
82 hdd 9.09560 1.00000 9313G 7240G 2073G 77.74 1.17 163
83 hdd 9.09560 1.00000 9313G 7241G 2072G 77.75 1.17 163
84 hdd 9.09560 1.00000 9313G 7687G 1626G 82.54 1.24 173
85 hdd 9.09560 1.00000 9313G 7244G 2069G 77.78 1.17 163
86 hdd 9.09560 1.00000 9313G 7466G 1847G 80.16 1.21 168
87 hdd 9.09560 1.00000 9313G 7953G 1360G 85.39 1.28 179
88 hdd 9.09569 1.00000 9313G 144G 9169G 1.56 0.02 3
89 hdd 9.09569 1.00000 9313G 241G 9072G 2.59 0.04 5
90 hdd 0 1.00000 9313G 6975M 9307G 0.07 0.00 0
91 hdd 0 1.00000 9313G 1854M 9312G 0.02 0 0
92 hdd 0 1.00000 9313G 1837M 9312G 0.02 0 0
93 hdd 0 1.00000 9313G 2001M 9312G 0.02 0 0
94 hdd 0 1.00000 9313G 1829M 9312G 0.02 0 0
95 hdd 0 1.00000 9313G 1807M 9312G 0.02 0 0
96 hdd 0 1.00000 9313G 1850M 9312G 0.02 0 0
97 hdd 0 1.00000 9313G 1311M 9312G 0.01 0 0
98 hdd 0 1.00000 9313G 1287M 9312G 0.01 0 0
99 hdd 0 1.00000 9313G 1279M 9312G 0.01 0 0
100 hdd 0 1.00000 9313G 1285M 9312G 0.01 0 0
101 hdd 0 1.00000 9313G 1271M 9312G 0.01 0 0
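For reference, the ratio bump and the reweights were done roughly like this (osds 11, 66 and 67 are the ones showing REWEIGHT 0.95000 in the table above):

    ceph osd set-nearfull-ratio 0.90
    ceph osd reweight 11 0.95
    ceph osd reweight 66 0.95
    ceph osd reweight 67 0.95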
On Tue, Mar 27, 2018 at 2:29 PM, Peter Linder <peter.linder@xxxxxxxxxxxxxx> wrote:
I've had similar issues, but I think your problem might be something else. Could you send the output of "ceph osd df"?
Other people will probably be interested in what version you are using as well.
On 2018-03-27 at 20:07, Jon Light wrote:
Hi all,
I'm adding a new OSD node with 36 OSDs to my cluster and have run into some problems. Here are some of the details of the cluster:
1 OSD node with 80 OSDs
1 EC pool with k=10, m=3
pg_num 1024
osd failure domain
I added a second OSD node and started creating OSDs with ceph-deploy, one by one. The first two came up fine, but each subsequent OSD left more and more PGs stuck activating. I've added a total of 14 new OSDs so far, but had to set 12 of them to a weight of 0 to keep the cluster healthy and usable until I get this fixed.
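One way to list and inspect the PGs that get stuck (the pg ID in the last command is just a made-up example):

    ceph health detail | grep activating
    ceph pg dump_stuck inactive
    ceph pg 3.1a query        # inspect peering / blocked_by details for one stuck PG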
I have read about similar behavior caused by PG overdose protection, but I don't think that's the case here because the failure domain is set to osd. Instead, I think my CRUSH rule needs some attention:
rule main-storage {
        id 1
        type erasure
        min_size 3
        max_size 13
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default class hdd
        step choose indep 0 type osd
        step emit
}
I don't believe I have modified anything from the automatically generated rule except for the addition of the hdd class.
I have been reading the documentation on CRUSH rules, but am having trouble figuring out whether the rule is set up properly. After a few more nodes are added I do want to change the failure domain to host, but osd is sufficient for now.
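One way I know of to exercise the rule offline and see whether it reliably finds 13 OSDs per PG is crushtool (file names here are just placeholders):

    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt                   # decompile for review
    crushtool -i crushmap.bin --test --rule 1 --num-rep 13 --show-bad-mappings

Any lines printed by --show-bad-mappings are inputs for which the rule could not come up with 13 distinct OSDs.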
Can anyone help out to see if the rule is causing the problems or if I should be looking at something else?
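And in case the overdose protection angle is still worth ruling out, a rough sanity check (assuming the Luminous default of mon_max_pg_per_osd = 200):

    # steady-state PG shards per hdd OSD: pg_num * (k+m) / number of hdd OSDs
    #   1024 * 13 / 80 ~= 166, which matches the PGS column in the df output (140-183)
    # while PGs are being remapped onto new OSDs, a PG is counted on both its old and
    # new OSDs, so per-OSD counts can temporarily climb above that
    ceph daemon mon.<id> config get mon_max_pg_per_osd    # run on a mon host; <id> is that mon's name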
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com