I have a completely new test cluster of three servers; each one is both a monitor and an OSD host, and each has a single disk.
The issue is that ceph status reports all 64 PGs as stale+undersized+degraded+peered:
 health HEALTH_WARN
        clock skew detected on mon.ceph01-osd03
        64 pgs degraded
        64 pgs stale
        64 pgs stuck degraded
        64 pgs stuck inactive
        64 pgs stuck stale
        64 pgs stuck unclean
        64 pgs stuck undersized
        64 pgs undersized
        too few PGs per OSD (21 < min 30)
        Monitor clock skew detected
 monmap e1: 3 mons at {ceph01-osd01=192.1.41.51:6789/0,ceph01-osd02=192.1.41.52:6789/0,ceph01-osd03=192.1.41.53:6789/0}
        election epoch 82, quorum 0,1,2 ceph01-osd01,ceph01-osd02,ceph01-osd03
 osdmap e36: 3 osds: 3 up, 3 in
  pgmap v85: 64 pgs, 1 pools, 0 bytes data, 0 objects
        101352 kB used, 8365 GB / 8365 GB avail
              64 stale+undersized+degraded+peered
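The clock skew part I can probably chase down myself; for the record, this is roughly how I was going to verify time sync across the monitors (assuming ntpd keeps time on these nodes and that I have ssh access between them):

# run on each mon host; offsets should stay within mon_clock_drift_allowed (0.05 s by default)
ntpq -p
# quick cross-node comparison of wall clocks (hostnames taken from the monmap above)
for h in ceph01-osd01 ceph01-osd02 ceph01-osd03; do ssh $h date +%s.%N; done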
ceph osd tree shows:
ID WEIGHT  TYPE NAME              UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 8.15996 root default
-2 2.71999     host ceph01-osd01
 0 2.71999         osd.0               up  1.00000          1.00000
-3 2.71999     host ceph01-osd02
 1 2.71999         osd.1               up  1.00000          1.00000
-4 2.71999     host ceph01-osd03
 2 2.71999         osd.2               up  1.00000          1.00000
Here is my crushmap:
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable straw_calc_version 1
# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root
# buckets
host ceph01-osd01 {
    id -2    # do not change unnecessarily
    # weight 2.720
    alg straw
    hash 0   # rjenkins1
    item osd.0 weight 2.720
}
host ceph01-osd02 {
    id -3    # do not change unnecessarily
    # weight 2.720
    alg straw
    hash 0   # rjenkins1
    item osd.1 weight 2.720
}
host ceph01-osd03 {
    id -4    # do not change unnecessarily
    # weight 2.720
    alg straw
    hash 0   # rjenkins1
    item osd.2 weight 2.720
}
root default {
    id -1    # do not change unnecessarily
    # weight 8.160
    alg straw
    hash 0   # rjenkins1
    item ceph01-osd01 weight 2.720
    item ceph01-osd02 weight 2.720
    item ceph01-osd03 weight 2.720
}
# rules
rule replicated_ruleset {
    ruleset 0
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type host
    step emit
}
# end crush map
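To rule out the CRUSH map itself, my plan was to test it offline with crushtool, something like the following (the file names are just placeholders):

# grab the compiled map from the cluster and decompile it for inspection
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# simulate placement for rule 0 with 3 replicas; if the rule can map every PG
# across the three hosts, --show-bad-mappings should print nothing
crushtool -i crushmap.bin --test --rule 0 --num-rep 3 --show-bad-mappings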
And here is the ceph.conf, which is shared among all nodes:
[global]
fsid = b9043917-5f65-98d5-8624-ee12ff32a5ea
public_network = 192.1.41.0/24
cluster_network = 192.168.0.0/24
mon_initial_members = ceph01-osd01, ceph01-osd02, ceph01-osd03
mon_host = 192.1.41.51,192.1.41.52,192.1.41.53
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
osd pool default pg num = 512
osd pool default pgp num = 512
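One thing I noticed while pasting this: despite the 512 defaults above, the pool clearly has only 64 PGs (see the pgmap line), so it must have been created before those settings took effect. If that needs fixing, I assume it can be done on the live pool like this (pool name 'rbd' is just my assumption, since it's the default pool):

ceph osd pool set rbd pg_num 512
ceph osd pool set rbd pgp_num 512

But I suspect that only explains the "too few PGs" warning, not the stale+peered state.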
The logs don't say much; the only recurring entries that add anything are the monitors' data_health updates:
mon.ceph01-osd01@0(leader).data_health(82) update_stats avail 88% total 9990 MB, used 1170 MB, avail 8819 MB
mon.ceph01-osd02@1(peon).data_health(82) update_stats avail 88% total 9990 MB, used 1171 MB, avail 8818 MB
mon.ceph01-osd03@2(peon).data_health(82) update_stats avail 88% total 9990 MB, used 1172 MB, avail 8817 MB
Does anyone have any thoughts on what might be wrong? Or is there other info I can provide to ease the search?
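For reference, here is what I was planning to gather next; happy to post the output of any of these (pool name 'rbd' and pgid 0.0 are just assumptions/examples):

ceph health detail
ceph pg dump_stuck stale
# query one of the stuck PGs directly
ceph pg 0.0 query
# check the pool's replication settings
ceph osd pool get rbd size
ceph osd pool get rbd min_size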
Thanks!