cephfs full, 2/3 Raw capacity used

Hi all,

We're building up our experience with our Ceph cluster before we take it into production. I've now tried to fill up the cluster with CephFS, which we plan to use for about 95% of all data on the cluster.

The CephFS pools report full when the cluster is at 67% raw capacity used. There are four pools we use for CephFS data: 3-copy, 4-copy, EC 8+3 and EC 5+7. The balancer module is turned on and `ceph balancer eval` gives `current cluster score 0.013255 (lower is better)`, so well within the default 5% margin. Is there a setting we can tweak to increase the usable raw capacity to, say, 85% or 90%, or is this the most we can expect to store on the cluster?
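
The only knobs I've found so far are the OSD full ratios, which I assume are still at the Nautilus defaults (0.85 nearfull / 0.90 backfillfull / 0.95 full, if I read the docs right). If those are indeed the right settings to look at, I'd expect to check them, and only if it's actually safe, raise them, roughly like this (please correct me if that's not how it works):

[root@cephmon1 ~]# ceph osd dump | grep -E 'nearfull_ratio|backfillfull_ratio|full_ratio'
# and, only if raising them is actually safe, something along the lines of:
[root@cephmon1 ~]# ceph osd set-nearfull-ratio 0.90
[root@cephmon1 ~]# ceph osd set-backfillfull-ratio 0.92

For reference, this is what the cluster currently reports: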

[root@cephmon1 ~]# ceph df
RAW STORAGE:
    CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
    hdd       1.8 PiB     605 TiB     1.2 PiB      1.2 PiB         66.71
    TOTAL     1.8 PiB     605 TiB     1.2 PiB      1.2 PiB         66.71

POOLS:
    POOL                    ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
    cephfs_data              1     111 MiB      79.26M     1.2 GiB     100.00          0 B
    cephfs_metadata          2      52 GiB       4.91M      52 GiB     100.00          0 B
    cephfs_data_4copy        3     106 TiB      46.36M     428 TiB     100.00          0 B
    cephfs_data_3copy        8      93 TiB      42.08M     282 TiB     100.00          0 B
    cephfs_data_ec83        13     106 TiB      50.11M     161 TiB     100.00          0 B
    rbd                     14      21 GiB       5.62k      63 GiB     100.00          0 B
    .rgw.root               15     1.2 KiB           4       1 MiB     100.00          0 B
    default.rgw.control     16         0 B           8         0 B          0          0 B
    default.rgw.meta        17       765 B           4       1 MiB     100.00          0 B
    default.rgw.log         18         0 B         207         0 B          0          0 B
    scbench                 19     133 GiB      34.14k     400 GiB     100.00          0 B
    cephfs_data_ec57        20     126 TiB      51.84M     320 TiB     100.00          0 B
[root@cephmon1 ~]# ceph balancer eval
current cluster score 0.013255 (lower is better)
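
Just to rule out an explicit per-pool limit, I assume the quotas can be checked with something like the command below (we never set any on purpose, as far as I know; pool name picked as an example):

[root@cephmon1 ~]# ceph osd pool get-quota cephfs_data_4copy

All pools showing MAX AVAIL 0 B at the same time makes me think it's not a quota, though.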


Being full at 2/3 raw used seems a bit too "pretty" to be accidental; it looks like it could be a parameter somewhere for CephFS, but I couldn't find anything like that in the Nautilus documentation.


The logs in the dashboard show this:
2019-08-26 11:00:00.000630
[ERR]
overall HEALTH_ERR 3 backfillfull osd(s); 1 full osd(s); 12 pool(s) full

2019-08-26 10:57:44.539964
[INF]
Health check cleared: POOL_BACKFILLFULL (was: 12 pool(s) backfillfull)

2019-08-26 10:57:44.539944
[WRN]
Health check failed: 12 pool(s) full (POOL_FULL)

2019-08-26 10:57:44.539926
[ERR]
Health check failed: 1 full osd(s) (OSD_FULL)

2019-08-26 10:57:44.539899
[WRN]
Health check update: 3 backfillfull osd(s) (OSD_BACKFILLFULL)

2019-08-26 10:00:00.000088
[WRN]
overall HEALTH_WARN 4 backfillfull osd(s); 12 pool(s) backfillfull
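
If the one full OSD is sitting at the default 0.95 full ratio while the average raw usage is 66.71%, then that OSD is at roughly 1.4 times the mean, which I find hard to square with the balancer score above (quick check, assuming the 0.95 default):

[root@cephmon1 ~]# echo "scale=2; 0.95 / 0.6671" | bc
1.42

I assume `ceph health detail` and `ceph osd df` are the right tools to confirm which OSD it is and how skewed the distribution really is; I can post that output as well if it's useful:

[root@cephmon1 ~]# ceph health detail | grep -i full
[root@cephmon1 ~]# ceph osd df
# mainly interested in the %USE and VAR columns and the MIN/MAX VAR / STDDEV summary at the end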

So it seems that Ceph is completely stuck at 2/3 full, while we had anticipated being able to fill the cluster to at least 85-90% of the raw capacity, or at least to a level where the cluster would keep functioning if a single OSD node failed.
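
For what it's worth, my back-of-the-envelope reasoning was: to lose one node out of N and still re-balance its data onto the remaining nodes below the 0.90 backfillfull ratio, the average usage needs to stay under about 0.90 * (N-1)/N, assuming a perfectly even spread. With N=12 (a made-up node count, just to illustrate) that works out to about 82%:

[root@cephmon1 ~]# echo "scale=3; 0.90 * (12-1) / 12" | bc
.825

which is why being hard-stuck at 67% came as a surprise.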

Cheers

/Simon
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


