Hi Norman,
I'm not pretending to know the exact root cause, but IMO one working
hypothesis might be as follows:
I'm presuming spinners as backing devices for your OSDs and hence a 64K
allocation unit (the bluestore_min_alloc_size_hdd param).
1) 1.48G user objects result in 1.48G * 6 = 8.88G EC shards.
2) Shards tend to be unaligned with the 64K allocation unit, which
results in an average loss of ~32K per shard.
3) Hence the total loss due to allocation overhead can be estimated at
32K * 8.88G = 284T, which looks close enough to your numbers for
default-fs-data0: 939 TiB - (374 TiB / 4 * 6) = 378 TiB of space lost.
(See the back-of-the-envelope sketch below.)
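Here is a minimal sketch of that estimate in Python, using the figures
from your ceph df output below; the 32K average loss per shard is just
the assumption that each unaligned shard wastes half a 64K allocation
unit on average:

# Back-of-the-envelope check of the allocation overhead estimate above.
# Figures are taken from the `ceph df` output below; EC profile is 4+2.

K, M = 4, 2                       # EC profile: 4 data + 2 coding chunks
objects = 1.48e9                  # objects in default-fs-data0
stored_tib = 374.0                # STORED reported for default-fs-data0
used_tib = 939.0                  # USED reported for default-fs-data0
alloc_unit = 64 * 1024            # bluestore_min_alloc_size_hdd (64K default)

shards = objects * (K + M)                    # ~8.88e9 EC shards
avg_loss = alloc_unit / 2                     # ~32K wasted per unaligned shard
est_overhead_tib = shards * avg_loss / 2**40  # ~265 TiB (exact figure depends
                                              # on unit conventions)

ideal_used_tib = stored_tib / K * (K + M)           # 374 / 4 * 6 = 561 TiB
observed_overhead_tib = used_tib - ideal_used_tib   # 939 - 561 = 378 TiB

print(f"estimated allocation overhead: ~{est_overhead_tib:.0f} TiB")
print(f"observed overhead (USED - STORED*6/4): ~{observed_overhead_tib:.0f} TiB")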
An additional issue which might result in space loss is the space
amplification caused by partial unaligned overwrites to objects in an
EC pool. See my post "Root cause analysis for space overhead with
erasure coded pools." to the dev@xxxxxxx mailing list on Jan 23.
Migrating to a 4K min alloc size seems to be the only known way to fix
(or rather work around) these issues. The upcoming Pacific release is
going to bring the default down to 4K (for new OSD deployments) along
with some additional changes to smooth the corresponding negative
performance impact.
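For illustration, redoing the same rough estimate with a 4K allocation
unit shows why this helps (same assumption as the sketch above, i.e.
half an allocation unit wasted per shard):

def alloc_overhead_tib(objects, k, m, alloc_unit_bytes):
    """Rough BlueStore allocation overhead for an EC pool, in TiB."""
    shards = objects * (k + m)
    avg_loss = alloc_unit_bytes / 2      # half an allocation unit per shard
    return shards * avg_loss / 2**40

for alloc in (64 * 1024, 4 * 1024):
    tib = alloc_overhead_tib(1.48e9, 4, 2, alloc)
    print(f"min_alloc_size = {alloc // 1024:2d}K -> est. overhead ~{tib:.0f} TiB")
# min_alloc_size = 64K -> est. overhead ~265 TiB
# min_alloc_size =  4K -> est. overhead ~17 TiB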
Hope this helps.
Igor
On 9/9/2020 2:30 AM, norman kern wrote:
Hi,
I have changed most of the pools from 3-replica to EC 4+2 in my
cluster. When I use the ceph df command to show the used capacity of
the cluster:
RAW STORAGE:
    CLASS        SIZE       AVAIL      USED       RAW USED    %RAW USED
    hdd          1.8 PiB    788 TiB    1.0 PiB    1.0 PiB         57.22
    ssd          7.9 TiB    4.6 TiB    181 GiB    3.2 TiB         41.15
    ssd-cache    5.2 TiB    5.2 TiB    67 GiB     73 GiB           1.36
    TOTAL        1.8 PiB    798 TiB    1.0 PiB    1.0 PiB         56.99

POOLS:
    POOL                              ID    STORED     OBJECTS    USED       %USED    MAX AVAIL
    default-oss.rgw.control            1    0 B              8    0 B            0    1.3 TiB
    default-oss.rgw.meta               2    22 KiB          97    3.9 MiB        0    1.3 TiB
    default-oss.rgw.log                3    525 KiB        223    621 KiB        0    1.3 TiB
    default-oss.rgw.buckets.index      4    33 MiB          34    33 MiB         0    1.3 TiB
    default-oss.rgw.buckets.non-ec     5    1.6 MiB         48    3.8 MiB        0    1.3 TiB
    .rgw.root                          6    3.8 KiB         16    720 KiB        0    1.3 TiB
    default-oss.rgw.buckets.data       7    274 GiB    185.39k    450 GiB     0.14    212 TiB
    default-fs-metadata                8    488 GiB    153.10M    490 GiB    10.65    1.3 TiB
    default-fs-data0                   9    374 TiB      1.48G    939 TiB    74.71    212 TiB
    ...
USED = 3 * STORED is completely right for the 3-replica pools, but for
the EC 4+2 pool (default-fs-data0) USED is not equal to 1.5 * STORED.
Why? ... :(
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx