I think this is resolved—and you're right about the 0-weight of the root
bucket being strange. I had created the rack buckets with
# ceph osd crush add-bucket rack-0 rack
whereas I should have used something like
# ceph osd crush add-bucket rack-0 rack root=default
There's a bit in the documentation
(https://docs.ceph.com/en/quincy/rados/operations/crush-map) that says
"Not all keys need to be specified" (in a different context, I admit).
I might have saved a second or two by omitting "root=default" and maybe
half a minute by not checking the CRUSH map carefully afterwards. It
was not worth it.
// J
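A check along the following lines might have caught the stray buckets earlier. This is only a sketch: the function name `unrooted_buckets` and the sample data are hypothetical, and it assumes the JSON form of the tree (as printed by `ceph osd tree -f json`) is a top-level "nodes" list whose entries carry "id", "name", "type" and "children". It reports every non-root bucket that no other bucket claims as a child.

```python
def unrooted_buckets(tree):
    """Return names of non-root buckets that no other bucket lists as a child."""
    nodes = tree["nodes"]
    # Collect every id that appears in some bucket's "children" list.
    child_ids = {c for n in nodes for c in n.get("children", [])}
    return sorted(
        n["name"]
        for n in nodes
        if n["id"] not in child_ids and n["type"] != "root"
    )


# Hand-abbreviated stand-in for the JSON tree, mirroring the layout posted
# in this thread (OSD leaves omitted for brevity; field names are assumed).
sample_tree = {
    "nodes": [
        {"id": -14, "name": "rack-1", "type": "rack", "children": [-7]},
        {"id": -7, "name": "bcgonen-r1h0", "type": "host", "children": []},
        {"id": -13, "name": "rack-0", "type": "rack", "children": [-3, -5]},
        {"id": -3, "name": "bcgonen-r0h0", "type": "host", "children": []},
        {"id": -5, "name": "bcgonen-r0h1", "type": "host", "children": []},
        {"id": -1, "name": "default", "type": "root", "children": []},
    ]
}

print(unrooted_buckets(sample_tree))
```

For the sample data both racks are reported, because root default has no children at all. One option for placing such a bucket after the fact is `ceph osd crush move rack-0 root=default`; whether that was the repair used here is not stated in the thread.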
On 2023-04-05 12:01, ceph@xxxxxxxxxx wrote:
I guess this is related to your CRUSH rules.
Unfortunately, I don't know much about creating the rules.
But someone could give more insight if you also provide the output of
ceph osd crush rule dump
Also, your "-1 0 root default" line is a bit strange.
On 1 April 2023 01:01:39 CEST, Johan Hattne <johan@xxxxxxxxx> wrote:
Here goes:
# ceph -s
  cluster:
    id:     e1327a10-8b8c-11ed-88b9-3cecef0e3946
    health: HEALTH_OK
  services:
    mon: 5 daemons, quorum bcgonen-a,bcgonen-b,bcgonen-c,bcgonen-r0h0,bcgonen-r0h1 (age 16h)
    mgr: bcgonen-b.furndm(active, since 8d), standbys: bcgonen-a.qmmqxj
    mds: 1/1 daemons up, 2 standby
    osd: 36 osds: 36 up (since 16h), 36 in (since 3d); 1041 remapped pgs
  data:
    volumes: 1/1 healthy
    pools:   3 pools, 1041 pgs
    objects: 5.42M objects, 6.5 TiB
    usage:   19 TiB used, 428 TiB / 447 TiB avail
    pgs:     27087125/16252275 objects misplaced (166.667%)
             1039 active+clean+remapped
             2    active+clean+remapped+scrubbing+deep
# ceph osd tree
ID   CLASS  WEIGHT     TYPE NAME             STATUS  REWEIGHT  PRI-AFF
-14         149.02008  rack rack-1
 -7         149.02008      host bcgonen-r1h0
 20    hdd   14.55269          osd.20            up   1.00000  1.00000
 21    hdd   14.55269          osd.21            up   1.00000  1.00000
 22    hdd   14.55269          osd.22            up   1.00000  1.00000
 23    hdd   14.55269          osd.23            up   1.00000  1.00000
 24    hdd   14.55269          osd.24            up   1.00000  1.00000
 25    hdd   14.55269          osd.25            up   1.00000  1.00000
 26    hdd   14.55269          osd.26            up   1.00000  1.00000
 27    hdd   14.55269          osd.27            up   1.00000  1.00000
 28    hdd   14.55269          osd.28            up   1.00000  1.00000
 29    hdd   14.55269          osd.29            up   1.00000  1.00000
 34    ssd    1.74660          osd.34            up   1.00000  1.00000
 35    ssd    1.74660          osd.35            up   1.00000  1.00000
-13         298.04016  rack rack-0
 -3         149.02008      host bcgonen-r0h0
  0    hdd   14.55269          osd.0             up   1.00000  1.00000
  1    hdd   14.55269          osd.1             up   1.00000  1.00000
  2    hdd   14.55269          osd.2             up   1.00000  1.00000
  3    hdd   14.55269          osd.3             up   1.00000  1.00000
  4    hdd   14.55269          osd.4             up   1.00000  1.00000
  5    hdd   14.55269          osd.5             up   1.00000  1.00000
  6    hdd   14.55269          osd.6             up   1.00000  1.00000
  7    hdd   14.55269          osd.7             up   1.00000  1.00000
  8    hdd   14.55269          osd.8             up   1.00000  1.00000
  9    hdd   14.55269          osd.9             up   1.00000  1.00000
 30    ssd    1.74660          osd.30            up   1.00000  1.00000
 31    ssd    1.74660          osd.31            up   1.00000  1.00000
 -5         149.02008      host bcgonen-r0h1
 10    hdd   14.55269          osd.10            up   1.00000  1.00000
 11    hdd   14.55269          osd.11            up   1.00000  1.00000
 12    hdd   14.55269          osd.12            up   1.00000  1.00000
 13    hdd   14.55269          osd.13            up   1.00000  1.00000
 14    hdd   14.55269          osd.14            up   1.00000  1.00000
 15    hdd   14.55269          osd.15            up   1.00000  1.00000
 16    hdd   14.55269          osd.16            up   1.00000  1.00000
 17    hdd   14.55269          osd.17            up   1.00000  1.00000
 18    hdd   14.55269          osd.18            up   1.00000  1.00000
 19    hdd   14.55269          osd.19            up   1.00000  1.00000
 32    ssd    1.74660          osd.32            up   1.00000  1.00000
 33    ssd    1.74660          osd.33            up   1.00000  1.00000
 -1                 0  root default
# ceph osd pool ls detail
pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 31 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr
pool 2 'cephfs.cephfs.meta' replicated size 3 min_size 2 crush_rule 2 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 9833 lfor 0/0/584 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 3 'cephfs.cephfs.data' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 1024 pgp_num 1024 autoscale_mode on last_change 7630 lfor 0/1831/6544 flags hashpspool,bulk stripe_width 0 application cephfs
CRUSH rules 1 and 2 are only used to assign the data pool to HDD and the metadata pool to SSD, respectively (failure domain: host).
// J
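As a sanity check on the tree listing, the OSD weights can be summed per device class. The sketch below is illustrative only: the names `HOSTS`, `PER_HOST` and `class_totals` are made up for this example, and the numbers are transcribed by hand from the listing. It assumes the CRUSH weights were left at their size-derived defaults, which conventionally track capacity in TiB.

```python
# Hosts and per-host OSD layout, transcribed from the tree listing.
# Every host has the same layout: 10 HDD OSDs and 2 SSD OSDs.
HOSTS = ["bcgonen-r1h0", "bcgonen-r0h0", "bcgonen-r0h1"]
PER_HOST = {"hdd": (10, 14.55269), "ssd": (2, 1.74660)}


def class_totals(hosts, per_host):
    """Sum CRUSH weight per device class across all hosts."""
    return {
        cls: len(hosts) * count * weight
        for cls, (count, weight) in per_host.items()
    }


totals = class_totals(HOSTS, PER_HOST)
for cls, weight in totals.items():
    print(f"{cls}: {weight:.2f}")
print(f"all: {sum(totals.values()):.2f}")
```

The combined figure comes out close to the 447 TiB that ceph -s reports as total capacity, and all of it sits under rack-0 and rack-1 rather than under root default, whose weight is 0.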
On 2023-03-31 15:37, ceph@xxxxxxxxxx wrote:
I need to know some more about your cluster:
ceph -s
ceph osd df tree
Replicated or EC?
...
Perhaps this can give us some insight.
Mehmet
On 31 March 2023 18:08:38 CEST, Johan Hattne <johan@xxxxxxxxx> wrote:
Dear all;
Up until a few hours ago, I had a seemingly normally-behaving
cluster (Quincy, 17.2.5) with 36 OSDs, evenly distributed across
3 of its 6 nodes. The cluster is only used for CephFS and the
only non-standard configuration I can think of is that I had 2
active MDSs, but only 1 standby. I had also doubled
mds_cache_memory_limit to 8 GB (all OSD hosts have 256 GB of RAM)
at some point in the past.
Then I rebooted one of the OSD nodes. The rebooted node held one
of the active MDSs. Now the node is back up: ceph -s says the
cluster is healthy, but all PGs are in an active+clean+remapped
state and 166.67% of the objects are misplaced (dashboard:
-66.66% healthy).
The data pool is a threefold replica with 5.4M objects; the
number of misplaced objects is reported as 27087410/16252446.
The denominator in the ratio makes sense to me (16.2M / 3 =
5.4M), but the numerator does not. I also note that the ratio is
*exactly* 5 / 3. The filesystem is still mounted and appears to
be usable, but df reports it as 100% full; I suspect it would
say 167% but that is capped somewhere.
Any ideas about what is going on? Any suggestions for recovery?
// Best wishes; Johan
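The arithmetic in the message above can be verified with exact rational numbers. This is a small sketch; the helper name `misplaced_ratio` is made up for illustration, and the counts are the ones quoted in the message.

```python
from fractions import Fraction


def misplaced_ratio(misplaced, total):
    """Return the misplaced/total object ratio as an exact, reduced fraction."""
    return Fraction(misplaced, total)


# Counts as reported in the original message.
misplaced, total = 27087410, 16252446

ratio = misplaced_ratio(misplaced, total)
print(ratio)                  # reduced fraction
print(f"{float(ratio):.3%}")  # as a percentage
print(Fraction(total, 3))     # object copies divided by the replica count
```

The ratio reduces to exactly 5/3, and the denominator divided by three gives 5417482, which matches the roughly 5.4M objects in the pool.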
------------------------------------------------------------------------
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx