Re: large difference between "STORED" and "USED" size of ceph df

On Sun, 3 May 2020 at 13:23, Lee, H. (Hurng-Chun) <h.lee@xxxxxxxxxxxxx> wrote:

> Hello,
>
> We use purely cephfs in our ceph cluster (version 14.2.7). The cephfs
> data is an EC pool (k=4, m=2) with hdd OSDs using bluestore. The
>

EC 4+2 == 50% overhead


>
> What triggered my attention is the discrepancy between the reported
> size of "USED" (52 TiB) and "STORED" (34 TiB) on the cephfs-data pool.
>

STORED is what your clients fed into the pool; USED is how much raw
space ceph consumed to store that data according to your pool
settings, so

34 * 1.5 = 51.0

which seems to be exactly what you asked for with EC 4+2.
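
If you want to sanity-check the arithmetic yourself, here is a tiny
Python sketch of the same calculation (the 1.5 factor is just
(k+m)/k for your profile; the TiB figures are the ones from your
ceph df output):

    # Expected raw usage for an erasure-coded pool: every k data
    # chunks get m coding chunks, so raw = stored * (k+m)/k.
    def expected_used_tib(stored_tib, k, m):
        return stored_tib * (k + m) / k

    print(expected_used_tib(34, 4, 2))  # 51.0, close to the 52 TiB reported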

On top of that, you might see losses from object sizes and the
smallest possible allocation unit, but those would appear on any kind
of replicated or EC pool, with effects that scale with the data
replication and layout strategy.
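
To give a feel for where those per-object losses come from, here is a
rough model in Python (not exact bluestore accounting): each object is
cut into k+m shards and every shard is padded up to the OSD allocation
unit, which I believe defaults to 64 KiB for HDD bluestore on
Nautilus -- check bluestore_min_alloc_size_hdd on your OSDs.

    import math

    # Rough raw footprint of one object in an EC pool: k data shards
    # plus m coding shards, each padded up to the allocation unit.
    # Ignores compression, omap and other metadata.
    def raw_object_bytes(object_bytes, k=4, m=2, alloc_unit=64 * 1024):
        shard = math.ceil(object_bytes / k)
        return math.ceil(shard / alloc_unit) * alloc_unit * (k + m)

    # A full 4 MiB cephfs object pads nothing (1 MiB shards align),
    # but a 100 KiB tail object occupies 6 * 64 KiB = 384 KiB raw.
    print(raw_object_bytes(4 * 1024 * 1024))  # 6291456 (6 MiB)
    print(raw_object_bytes(100 * 1024))       # 393216  (384 KiB)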

I don't think 12M objs * 4M really weighs in directly; it might just
be a coincidence that this product currently matches the storage
overhead. Conversely, if it DID match 100%, you would have zero
redundancy in the pool, even though you have configured it to survive
the loss of two drives without risking data.

-- 
May the most significant bit of your life be positive.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



