Hello all!
I need some help with my Ceph cluster.
I've installed a Ceph cluster on two physical servers, with one 40G OSD (/data) on each.
Here is my ceph.conf:
[global]
fsid = 377174ff-f11f-48ec-ad8b-ff450d43391c
mon_initial_members = vm35, vm36
mon_host = 192.168.1.35,192.168.1.36
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
osd pool default size = 2 # Write an object 2 times.
osd pool default min size = 1 # Allow writing one copy in a degraded state.
osd pool default pg num = 200
osd pool default pgp num = 200
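(If I understand the usual placement-group sizing rule correctly, that is where the 200 comes from:

    total PGs ~= (number of OSDs * 100) / replica count = (4 * 100) / 2 = 200

though I am not sure whether that figure is meant per pool or for the whole cluster.)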
Right after creation it was HEALTH_OK, so I started filling it. I wrote 40G of data to the cluster through the RADOS Gateway, but the cluster used up all available space and kept growing even after I added two more OSDs (a 10G /data1 on each server).
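In case it helps, I can also attach per-pool usage; as far as I know this is the command that breaks it down per pool (output omitted for brevity):

# ceph df detail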
Here is the tree output:
# ceph osd tree
ID WEIGHT  TYPE NAME     UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.09756 root default
-2 0.04878     host vm35
 0 0.03899         osd.0      up  1.00000          1.00000
 2 0.00980         osd.2      up  1.00000          1.00000
-3 0.04878     host vm36
 1 0.03899         osd.1      up  1.00000          1.00000
 3 0.00980         osd.3      up  1.00000          1.00000
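(If I read the WEIGHT column correctly, it is just the OSD size in TiB, so 40G / 1024 ~= 0.039 for osd.0 and osd.1, and 10G / 1024 ~= 0.0098 for osd.2 and osd.3, which looks right to me.)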
and the health output:
root@vm35:/etc# ceph health
HEALTH_ERR 5 pgs backfill_toofull; 15 pgs degraded; 16 pgs stuck unclean; 15 pgs undersized; recovery 87176/300483 objects degraded (29.012%); recovery 62272/300483 objects misplaced (20.724%); 1 full osd(s); 2 near full osd(s); pool default.rgw.buckets.data has many more objects per pg than average (too few pgs?)
root@vm35:/etc# ceph health detail
HEALTH_ERR 5 pgs backfill_toofull; 15 pgs degraded; 16 pgs stuck unclean; 15 pgs undersized; recovery 87176/300483 objects degraded (29.012%); recovery 62272/300483 objects misplaced (20.724%); 1 full osd(s); 2 near full osd(s); pool default.rgw.buckets.data has many more objects per pg than average (too few pgs?)
pg 10.5 is stuck unclean since forever, current state active+undersized+degraded, last acting [1,0]
pg 9.6 is stuck unclean since forever, current state active+undersized+degraded+remapped+backfill_toofull, last acting [1,0]
pg 10.4 is stuck unclean since forever, current state active+remapped, last acting [3,0,1]
pg 9.7 is stuck unclean since forever, current state active+undersized+degraded+remapped+backfill_toofull, last acting [1,0]
pg 10.7 is stuck unclean since forever, current state active+undersized+degraded+remapped+backfill_toofull, last acting [0,1]
pg 9.4 is stuck unclean since forever, current state active+undersized+degraded, last acting [1,0]
pg 9.1 is stuck unclean since forever, current state active+undersized+degraded, last acting [0,3]
pg 10.2 is stuck unclean since forever, current state active+undersized+degraded, last acting [1,0]
pg 9.0 is stuck unclean since forever, current state active+undersized+degraded, last acting [1,2]
pg 10.3 is stuck unclean since forever, current state active+undersized+degraded, last acting [2,1]
pg 9.3 is stuck unclean since forever, current state active+undersized+degraded+remapped+backfill_toofull, last acting [1,0]
pg 10.0 is stuck unclean since forever, current state active+undersized+degraded+remapped+backfill_toofull, last acting [1,0]
pg 9.2 is stuck unclean since forever, current state active+undersized+degraded, last acting [0,1]
pg 10.1 is stuck unclean since forever, current state active+undersized+degraded, last acting [0,1]
pg 9.5 is stuck unclean since forever, current state active+undersized+degraded, last acting [1,0]
pg 10.6 is stuck unclean since forever, current state active+undersized+degraded, last acting [0,1]
pg 9.1 is active+undersized+degraded, acting [0,3]
pg 10.2 is active+undersized+degraded, acting [1,0]
pg 9.0 is active+undersized+degraded, acting [1,2]
pg 10.3 is active+undersized+degraded, acting [2,1]
pg 9.3 is active+undersized+degraded+remapped+backfill_toofull, acting [1,0]
pg 10.0 is active+undersized+degraded+remapped+backfill_toofull, acting [1,0]
pg 9.2 is active+undersized+degraded, acting [0,1]
pg 10.1 is active+undersized+degraded, acting [0,1]
pg 9.5 is active+undersized+degraded, acting [1,0]
pg 10.6 is active+undersized+degraded, acting [0,1]
pg 9.4 is active+undersized+degraded, acting [1,0]
pg 10.7 is active+undersized+degraded+remapped+backfill_toofull, acting [0,1]
pg 9.7 is active+undersized+degraded+remapped+backfill_toofull, acting [1,0]
pg 9.6 is active+undersized+degraded+remapped+backfill_toofull, acting [1,0]
pg 10.5 is active+undersized+degraded, acting [1,0]
recovery 87176/300483 objects degraded (29.012%)
recovery 62272/300483 objects misplaced (20.724%)
osd.1 is full at 95%
osd.2 is near full at 91%
osd.3 is near full at 91%
pool default.rgw.buckets.data objects per pg (12438) is more than 17.8451 times cluster average (697)
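If it would help, I can also attach the full state of one of the stuck PGs, e.g.:

root@vm35:/etc# ceph pg 9.6 query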
In the log I see this:
2016-09-26 10:37:21.688849 mon.0 192.168.1.35:6789/0 4836 : cluster [INF] pgmap v8364: 144 pgs: 5 active+undersized+degraded+remapped+backfill_toofull, 1 active+remapped, 128 active+clean, 10 active+undersized+degraded; 33090 MB data, 92431 MB used, 9908 MB / 102340 MB avail; 87176/300483 objects degraded (29.012%); 62272/300483 objects misplaced (20.724%)
2016-09-26 10:37:22.192322 osd.3 192.168.1.36:6804/3840 11 : cluster [WRN] OSD near full (91%)
2016-09-26 10:37:38.295580 osd.1 192.168.1.36:6800/4014 16 : cluster [WRN] OSD near full (95%)
How can I solve this issue? Why is my cluster using much more space than I put in? (I wrote 40G with two replicas, so I expected the cluster to use about 80G.)
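My arithmetic, based on the pgmap line above:

    expected used = 33090 MB data * 2 replicas = 66180 MB
    actual used   = 92431 MB  =>  92431 / 33090 ~= 2.8 copies per object

so it looks like each object is stored almost three times instead of twice.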
What am I doing wrong?
Thank you!