The problem is now solved: the cluster is backfilling/recovering normally and there are no more NEAR FULL OSD warnings.
It turns out I had RBD objects that should have been deleted a long time ago but were still there. OpenStack Glance did not remove them; I think it's an issue with snapshots, since an RBD image can't be deleted until its snapshots are purged. So I compared all my Glance images against their RBD counterparts, identified the RBD images that no longer exist in Glance, and deleted them.
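For reference, the cleanup went roughly along these lines (a rough sketch, not the exact commands I ran; it assumes the default Glance pool name "images", the protected "snap" snapshot that Glance creates, and that no clones still depend on those snapshots):
---
# UUIDs Glance still knows about
glance image-list | awk -F'|' '{print $2}' | tr -d ' ' | grep -E '^[0-9a-f-]{36}$' | sort > glance_uuids.txt
# RBD images actually present in the Glance pool
rbd ls -p images | sort > rbd_uuids.txt
# RBD images with no matching Glance record
comm -13 glance_uuids.txt rbd_uuids.txt > orphans.txt
# unprotect/purge snapshots, then delete each orphan
while read img; do
    rbd snap unprotect images/"$img"@snap   # Glance protects its "snap" snapshot; skip if already unprotected
    rbd snap purge images/"$img"
    rbd rm images/"$img"
done < orphans.txt
---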
So from 81% utilization I am down to 61%.
---
[root@controller-node opt]# ceph df
GLOBAL:
    SIZE        AVAIL       RAW USED     %RAW USED
    100553G     39118G      61435G       61.10
POOLS:
    NAME        ID     USED       %USED     OBJECTS
    images      4      1764G      1.76      225978
    volumes     5      18533G     18.43     4762609
[root@controller-node opt]#
---
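For what it's worth, the raw figure above is again roughly the pool totals times the replication factor: 1764G + 18533G = 20297G, and 20297G x 3 = 60891G, which is close to the 61435G RAW USED, so the accounting is consistent with size=3 on both pools (as Lionel explains below).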
On Sat, Feb 20, 2016 at 5:38 AM, Lionel Bouton <lionel-subscription@xxxxxxxxxxx> wrote:
Le 19/02/2016 17:17, Don Laursen a écrit :
Thanks. To summarize:
Your data, images+volumes = 27.15% space used
Raw used = 81.71% used
This is a big difference that I can't account for. Can anyone? So, is your cluster actually full?
I believe this is just the pool size (the replica count) being accounted for, and it is harmless: 3 x 27.15 = 81.45, which is awfully close to 81.71.
We have the same behavior on our Ceph cluster.
I had the same problem with my small cluster. Raw used was about 85% and actual data, with replication, was about 30%. My OSDs were also BTRFS, and BTRFS was causing its own problems. I fixed it by removing each OSD one at a time and re-adding it with the default XFS filesystem. Doing so brought the used percentages back into agreement, and it's good now.
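Roughly, per OSD, that works out to something like the following (a sketch only; the exact commands depend on the Ceph release and how the OSDs were deployed, and /dev/sdX is a placeholder):
---
ceph osd out 3                        # let data drain off osd.3 first
# wait for rebalancing to finish (ceph -w / ceph health)
service ceph stop osd.3               # or: systemctl stop ceph-osd@3
ceph osd crush remove osd.3
ceph auth del osd.3
ceph osd rm 3
# re-create the OSD on the same disk with XFS (the default) and let it backfill
ceph-disk prepare --fs-type xfs /dev/sdX
ceph-disk activate /dev/sdX1
---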
That's odd: AFAIK we had the same behaviour with XFS before migrating to BTRFS.
Best regards,
Lionel
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com