Hi David,
Apologies for the late response.
nodeB is mon + client, nodeC is client.
ceph health detail:
HEALTH_ERR 819 pgs are stuck inactive for more than 300 seconds; 883 pgs degraded; 64 pgs stale; 819 pgs stuck inactive; 1064 pgs stuck unclean; 883 pgs undersized; 22 requests are blocked > 32 sec; 3 osds have slow requests; recovery 2/8 objects degraded (25.000%); recovery 2/8 objects misplaced (25.000%); crush map has legacy tunables (require argonaut, min is firefly); crush map has straw_calc_version=0
pg 2.fc is stuck inactive since forever, current state undersized+degraded+peered, last acting [2]
pg 2.fd is stuck inactive since forever, current state undersized+degraded+peered, last acting [0]
pg 2.fe is stuck inactive since forever, current state undersized+degraded+peered, last acting [2]
pg 2.ff is stuck inactive since forever, current state undersized+degraded+peered, last acting [1]
pg 1.fb is stuck inactive for 493857.572982, current state undersized+degraded+peered, last acting [4]
pg 2.f8 is stuck inactive since forever, current state undersized+degraded+peered, last acting [3]
pg 1.fa is stuck inactive for 492185.443146, current state undersized+degraded+peered, last acting [0]
pg 2.f9 is stuck inactive since forever, current state undersized+degraded+peered, last acting [0]
pg 1.f9 is stuck inactive for 492185.452890, current state undersized+degraded+peered, last acting [2]
pg 2.fa is stuck inactive since forever, current state undersized+degraded+peered, last acting [3]
pg 1.f8 is stuck inactive for 492185.443324, current state undersized+degraded+peered, last acting [0]
pg 2.fb is stuck inactive since forever, current state undersized+degraded+peered, last acting [2]
...
pg 1.fb is undersized+degraded+peered, acting [4]
pg 2.ff is undersized+degraded+peered, acting [1]
pg 2.fe is undersized+degraded+peered, acting [2]
pg 2.fd is undersized+degraded+peered, acting [0]
pg 2.fc is undersized+degraded+peered, acting [2]
3 ops are blocked > 536871 sec on osd.4
15 ops are blocked > 268435 sec on osd.4
1 ops are blocked > 262.144 sec on osd.4
2 ops are blocked > 268435 sec on osd.3
1 ops are blocked > 268435 sec on osd.1
3 osds have slow requests
recovery 2/8 objects degraded (25.000%)
recovery 2/8 objects misplaced (25.000%)
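(A side note on the last two warnings in the health output: as far as I understand, the legacy-tunables and straw_calc_version messages are cleared by bumping the CRUSH tunables, roughly along the lines of

    ceph osd crush tunables firefly
    ceph osd crush set-tunable straw_calc_version 1

but from what I've read both can trigger some data movement, so they're probably better left alone until the cluster is healthy again.)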
ceph osd stat:
cluster-admin@nodeB:~/.ssh/ceph-cluster$ cat ceph_osd_stat.txt
osdmap e80: 10 osds: 5 up, 5 in; 558 remapped pgs
flags sortbitwise
ceph osd tree:
cluster-admin@nodeB:~/.ssh/ceph-cluster$ ceph osd tree
ID WEIGHT  TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 9.08691 root default
-2 4.54346     host nodeB
 5 0.90869         osd.5      down        0          1.00000
 6 0.90869         osd.6      down        0          1.00000
 7 0.90869         osd.7      down        0          1.00000
 8 0.90869         osd.8      down        0          1.00000
 9 0.90869         osd.9      down        0          1.00000
-3 4.54346     host nodeC
 0 0.90869         osd.0        up  1.00000          1.00000
 1 0.90869         osd.1        up  1.00000          1.00000
 2 0.90869         osd.2        up  1.00000          1.00000
 3 0.90869         osd.3        up  1.00000          1.00000
 4 0.90869         osd.4        up  1.00000          1.00000
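(All five OSDs on nodeB show down and out. If it helps, I can pull the service status and recent logs for them on nodeB, something like the following, assuming this release uses the usual systemd unit names:

    systemctl status ceph-osd@5
    journalctl -u ceph-osd@5 --no-pager | tail -n 50

and the same for osd.6 through osd.9.)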
CRUSH map:
# begin crush map
# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7
device 8 osd.8
device 9 osd.9
# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root
# buckets
host nodeB {
    id -2    # do not change unnecessarily
    # weight 4.543
    alg straw
    hash 0   # rjenkins1
    item osd.5 weight 0.909
    item osd.6 weight 0.909
    item osd.7 weight 0.909
    item osd.8 weight 0.909
    item osd.9 weight 0.909
}
host nodeC {
    id -3    # do not change unnecessarily
    # weight 4.543
    alg straw
    hash 0   # rjenkins1
    item osd.0 weight 0.909
    item osd.1 weight 0.909
    item osd.2 weight 0.909
    item osd.3 weight 0.909
    item osd.4 weight 0.909
}
root default {
    id -1    # do not change unnecessarily
    # weight 9.087
    alg straw
    hash 0   # rjenkins1
    item nodeB weight 4.543
    item nodeC weight 4.543
}
# rules
rule replicated_ruleset {
    ruleset 0
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type host
    step emit
}
# end crush map
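For what it's worth: replicated_ruleset does "chooseleaf firstn 0 type host", i.e. one replica per host, so with every nodeB OSD down I'd expect exactly the undersized+degraded+peered state shown above for any pool with size >= 2. If you want to sanity-check the rule itself offline, I believe crushtool can replay the mappings, roughly (filenames here are just placeholders):

    crushtool -c crushmap.txt -o crushmap.bin
    crushtool -i crushmap.bin --test --show-mappings --rule 0 --num-rep 2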
ceph.conf:
cluster-admin@nodeB:~/.ssh/ceph-cluster$ cat /etc/ceph/ceph.conf
[global]
fsid = a04e9846-6c54-48ee-b26f-d6949d8bacb4
mon_initial_members = nodeB
mon_host = <mon IP>
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public_network = X.X.X.0/24
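Nothing in ceph.conf overrides the pool replica counts, so the pools should be on the defaults; if it helps I can confirm the actual size/min_size per pool with something like:

    ceph osd pool ls detail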