Remember that `ceph df` takes into account the full-ratio reserved space, and the headroom between that threshold and the most-full OSD.

Run `ceph osd df` and look at the PGs and VAR columns:
https://www.ibm.com/docs/en/storage-ceph/7?topic=monitoring-understanding-osd-usage-stats

If you see high variability, the balancer may not be enabled, or a CRUSH nuance may be preventing it from working. If the PGs column is REALLY low you could have a bin-packing phenomenon. I suspect a balancer issue.

Also, when your OSDs vary a lot in size, it's best to balance the cumulative weights of each failure domain. In your case that's probably host, so run `ceph osd tree | grep host` and compare the weights. If your failure domain is host for all CRUSH rules, then I would think that a 24-32 TB host weight variance would not be a big problem, since you have more hosts than the 6+2 EC width requires. If your failure domain is rack, do the above substituting rack.

Send your entire `ceph osd tree` and `ceph osd df` output here if you like, since they'll be short, and check your CRUSH rules. I suspect that you have at least one outlier OSD that is more full than the others.

> On Jan 2, 2025, at 2:56 AM, Nicola Mori <mori@xxxxxxxxxx> wrote:
> 
> Dear Ceph users,
> 
> I'd need some help with CephFS. Originally, my cluster consisted of 12 hosts with different total raw disk sizes, ranging from 16 to 32 TB; there were two pools, one 3x replicated metadata pool and one 6+2 EC data pool with host failure domain, which I access via CephFS. In total there were 512 PGs.
> 
> Naively, I would expect the available size of the data pool to be about 75% of the total available space given the EC parameters (6/(6+2) = 75%), but I noticed that it was actually lower, around 66% (as reported by df -h, while for the total raw capacity I used the value reported by the Ceph dashboard).
> I guessed this could be due to smaller hosts limiting the total amount of space available to the filesystem, so I started a disk upgrade campaign to make the host sizes more even. For example, for a 16 TB host with 8x2TB disks I replaced two disks with 8 TB ones, for a new raw size of 28 TB. I did this and similar upgrades for other hosts in steps, each time waiting for backfill to finish and checking the new filesystem size before proceeding. And I noticed that the filesystem size reported by df -h always remained at 66% of the total raw capacity, even though host sizes are now all between 24 and 32 TB.
> 
> So my guess about the limiting factor for the relative fs size is evidently wrong. I thought that for some reason maybe 66% might be the actual limit and considered some possible explanations (e.g. some space being reserved for the metadata pool), and even the possibility that the size reported by df -h is meaningless, but I'd need some advice from an expert to sort this out.
> 
> Thanks in advance for any help.
> 
> Nicola
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
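P.S. To illustrate why one over-full OSD can cap the pool's reported size: below is a simplified, hypothetical model of how `ceph df` derives MAX AVAIL for an EC k+m pool. This is a sketch under stated assumptions, not Ceph's actual accounting; the function name and the example numbers are illustrative only.

```python
# Toy model (not Ceph's real code): for an EC k+m pool, available space
# is limited by the OSD closest to the full ratio, not by the sum of
# free space across OSDs. One outlier OSD therefore caps the whole pool.

def pool_max_avail(osd_sizes_tb, osd_used_tb, k, m, full_ratio=0.95):
    """Rough model of 'ceph df' MAX AVAIL for an EC k+m pool.

    Assumes PGs are spread evenly in proportion to OSD size, so the
    usable fraction is set by the OSD with the least headroom below
    the full ratio.
    """
    # Fraction of each OSD still usable before it hits the full ratio
    headroom = [
        (full_ratio * size - used) / size
        for size, used in zip(osd_sizes_tb, osd_used_tb)
    ]
    # The most-full OSD (smallest headroom fraction) limits everything
    limiting = min(headroom)
    raw_avail = limiting * sum(osd_sizes_tb)
    # EC k+m stores k data chunks per k+m raw chunks
    return raw_avail * k / (k + m)

# Balanced: eight 2 TB OSDs, all 40% full
balanced = pool_max_avail([2.0] * 8, [0.8] * 8, k=6, m=2)

# Skewed: same cluster, but one OSD is 80% full
skewed = pool_max_avail([2.0] * 8, [0.8] * 7 + [1.6], k=6, m=2)

print(f"balanced: {balanced:.2f} TB, skewed: {skewed:.2f} TB")
# -> balanced: 6.60 TB, skewed: 1.80 TB
```

The point of the toy model: raw free space barely changed between the two cases, but the reported available size collapsed because the single most-full OSD sets the limit. That is why checking VAR in `ceph osd df` for outliers is the first step.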