Hi all,
I have set up a test cluster with 3 servers.
Everything has default values, with a replication of 3.
I have created one volume called gds-common,
and the data pool has been configured with compression_algorithm lz4
and compression_mode aggressive.
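For reference, the compression was enabled on the data pool with something
along these lines (writing the commands from memory, so the exact invocation
may differ slightly):

ceph osd pool set gds-common_data compression_algorithm lz4    # from memory
ceph osd pool set gds-common_data compression_mode aggressive  # from memory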
I have copied 71 TB of data to this volume, but I cannot get my head around
the usage information on the cluster.
Most of this data consists of quite small files containing plain text,
so I expect the compression ratio to be quite good.
With both the data source I am copying from and the CephFS volume mounted,
df -h gives:
Filesystem                                            Size  Used  Avail  Use%  Mounted on
urd-gds-031:/gds-common                               163T   71T    92T   44%  /gds-common
10.10.100.0:6789,10.10.100.1:6789,10.10.100.2:6789:/   92T   68T    25T   74%  /ceph-gds-common
Looking at this, the compression ratio does not seem to be very good,
or is the Used column showing an uncompressed value?
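To make the reasoning behind that impression explicit, here is the rough
calculation, assuming the 68T shown as Used on the CephFS mount is the
compressed size (which is exactly what I am unsure about):

echo "scale=2; (71-68)/71*100" | bc   # only about 4% saving, if that assumption holds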
Using the command ceph df detail:
--- RAW STORAGE ---
CLASS    SIZE     AVAIL    USED     RAW USED  %RAW USED
hdd      262 TiB  94 TiB   168 TiB  168 TiB       64.10
TOTAL    262 TiB  94 TiB   168 TiB  168 TiB       64.10

--- POOLS ---
POOL                 ID  PGS   STORED   (DATA)   (OMAP)   OBJECTS  USED     (DATA)   (OMAP)  %USED  MAX AVAIL  QUOTA OBJECTS  QUOTA BYTES  DIRTY  USED COMPR  UNDER COMPR
.mgr                  1     1  24 MiB   24 MiB   0 B            8  73 MiB   73 MiB   0 B         0     25 TiB            N/A          N/A    N/A         0 B          0 B
gds-common_data       2  1024  67 TiB   67 TiB   0 B       23.31M  167 TiB  167 TiB  0 B     69.43     25 TiB            N/A          N/A    N/A      35 TiB       70 TiB
gds-common_metadata   3    32  4.0 GiB  251 MiB  3.8 GiB  680.88k  12 GiB   753 MiB  11 GiB   0.02     25 TiB            N/A          N/A    N/A         0 B          0 B
.rgw.root             4    32  1.4 KiB  1.4 KiB  0 B            4  48 KiB   48 KiB   0 B         0     25 TiB            N/A          N/A    N/A         0 B          0 B
default.rgw.log       5    32  182 B    182 B    0 B            2  24 KiB   24 KiB   0 B         0     25 TiB            N/A          N/A    N/A         0 B          0 B
default.rgw.control   6    32  0 B      0 B      0 B            7  0 B      0 B      0 B         0     25 TiB            N/A          N/A    N/A         0 B          0 B
default.rgw.meta      7    32  0 B      0 B      0 B            0  0 B      0 B      0 B         0     25 TiB            N/A          N/A    N/A         0 B          0 B
From my understanding, the raw storage USED contains all 3 copies,
so this means about 56 TB per copy, which gives a compression saving of
about 20%, if this is a compressed value?
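In numbers, this is the back-of-the-envelope calculation I am doing,
assuming RAW USED already reflects compression:

echo "scale=2; 168/3" | bc            # about 56 TiB raw used per copy
echo "scale=2; (71-56)/71*100" | bc   # roughly 21% saving compared to the 71 TB copied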
Looking at the pool gds-common_data, the STORED value of 67 TB is an
uncompressed, per-copy value, right?
The USED value for gds-common_data is the raw usage of all 3 copies,
right?
The %RAW USED value makes sense (64.10), but the gds-common_data %USED
differs (69.43), and I cannot figure out what this value relates to.
As I understand it, UNDER COMPR is the amount of data that Ceph has
recognized as compressible (70 TB), so that is about all the data.
I do not understand the USED COMPR value (35 TB): does this specify how
much the data has been compressed, i.e. 70 TB has been compressed down
to 35 TB?
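If that reading is right, the ratio would simply be:

echo "scale=2; 35/70*100" | bc   # 50%, i.e. the data under compression is roughly halved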
But which values are reported as compressed, and which show the raw,
uncompressed sizes?
Are all values uncompressed, and is the only place compression shows up
"USED COMPR" and "UNDER COMPR"?
But when do I actually run out of storage in my cluster, and which value
should I keep my eyes on if %USED is calculated on uncompressed data?
Does this mean that I have more storage available than shown by %USED?
Does df -h on a mount show the uncompressed used value?
Then we have mon_osd_full_ratio: does this mean that the first OSD
that reaches 0.95 full (the default) makes the system stop client writes,
and so on?
And does mon_osd_full_ratio always hit its limit before %RAW USED
reaches 100% or a pool's %USED reaches 100%, or what happens if one of
the used values reaches 100% before mon_osd_full_ratio?
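For completeness, these are the places I have been looking at for the
ratios and per-OSD fullness (I assume these are the relevant commands):

ceph osd dump | grep ratio   # full_ratio, backfillfull_ratio, nearfull_ratio
ceph osd df                  # per-OSD utilisation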
I am sorry for all the questions, but even after reading the documentation
I cannot seem to figure this out.
All help is appreciated.
Many thanks in advance!
Best regards
Marcus