Hi Igor,
Many thanks for your reply. Here are the
details about the cluster:
1. Ceph version: 13.2.5-1xenial (installed from the Ceph repository for Ubuntu 16.04).
2. Main devices for the radosgw pool: HDD. We do use a few SSDs for another pool, but that pool is not used by radosgw.
3. We use BlueStore.
4. Average RGW object size: I'm not sure how to check that, and couldn't find a simple answer on Google either. Could you please let me know how to check it? I've put a rough attempt after the command outputs below; please tell me if that's the right way to measure it.
5. Ceph osd df tree:
6. Other useful info on the cluster:
# ceph osd df tree
 ID CLASS WEIGHT    REWEIGHT SIZE    USE     AVAIL   %USE  VAR  PGS TYPE NAME
 -1       112.17979        - 113 TiB  90 TiB  23 TiB 79.25 1.00   - root uk
 -5       112.17979        - 113 TiB  90 TiB  23 TiB 79.25 1.00   -   datacenter ldex
-11       112.17979        - 113 TiB  90 TiB  23 TiB 79.25 1.00   -     room ldex-dc3
-13       112.17979        - 113 TiB  90 TiB  23 TiB 79.25 1.00   -       row row-a
 -4       112.17979        - 113 TiB  90 TiB  23 TiB 79.25 1.00   -         rack ldex-rack-a5
 -2        28.04495        -  28 TiB  22 TiB 6.2 TiB 77.96 0.98   -           host arh-ibstorage1-ib
  0   hdd   2.73000  0.79999 2.8 TiB 2.3 TiB 519 GiB 81.61 1.03 145             osd.0
  1   hdd   2.73000  1.00000 2.8 TiB 1.9 TiB 847 GiB 70.00 0.88 130             osd.1
  2   hdd   2.73000  1.00000 2.8 TiB 2.2 TiB 561 GiB 80.12 1.01 152             osd.2
  3   hdd   2.73000  1.00000 2.8 TiB 2.3 TiB 469 GiB 83.41 1.05 160             osd.3
  4   hdd   2.73000  1.00000 2.8 TiB 1.8 TiB 983 GiB 65.18 0.82 141             osd.4
 32   hdd   5.45999  1.00000 5.5 TiB 4.4 TiB 1.1 TiB 80.68 1.02 306             osd.32
 35   hdd   2.73000  1.00000 2.8 TiB 1.7 TiB 1.0 TiB 62.89 0.79 126             osd.35
 36   hdd   2.73000  1.00000 2.8 TiB 2.3 TiB 464 GiB 83.58 1.05 175             osd.36
 37   hdd   2.73000  0.89999 2.8 TiB 2.5 TiB 301 GiB 89.34 1.13 160             osd.37
  5   ssd   0.74500  1.00000 745 GiB 642 GiB 103 GiB 86.15 1.09  65             osd.5
 -3        28.04495        -  28 TiB  24 TiB 4.5 TiB 84.03 1.06   -           host arh-ibstorage2-ib
  9   hdd   2.73000  0.95000 2.8 TiB 2.4 TiB 405 GiB 85.65 1.08 158             osd.9
 10   hdd   2.73000  0.89999 2.8 TiB 2.4 TiB 352 GiB 87.52 1.10 169             osd.10
 11   hdd   2.73000  1.00000 2.8 TiB 2.0 TiB 783 GiB 72.28 0.91 160             osd.11
 12   hdd   2.73000  0.84999 2.8 TiB 2.4 TiB 359 GiB 87.27 1.10 153             osd.12
 13   hdd   2.73000  1.00000 2.8 TiB 2.4 TiB 348 GiB 87.69 1.11 169             osd.13
 14   hdd   2.73000  1.00000 2.8 TiB 2.5 TiB 283 GiB 89.97 1.14 170             osd.14
 15   hdd   2.73000  1.00000 2.8 TiB 2.2 TiB 560 GiB 80.18 1.01 155             osd.15
 16   hdd   2.73000  0.95000 2.8 TiB 2.4 TiB 332 GiB 88.26 1.11 178             osd.16
 26   hdd   5.45999  1.00000 5.5 TiB 4.4 TiB 1.0 TiB 81.04 1.02 324             osd.26
  7   ssd   0.74500  1.00000 745 GiB 607 GiB 138 GiB 81.48 1.03  62             osd.7
-15        28.04495        -  28 TiB  22 TiB 6.4 TiB 77.40 0.98   -           host arh-ibstorage3-ib
 18   hdd   2.73000  0.95000 2.8 TiB 2.5 TiB 312 GiB 88.96 1.12 156             osd.18
 19   hdd   2.73000  1.00000 2.8 TiB 2.0 TiB 771 GiB 72.68 0.92 162             osd.19
 20   hdd   2.73000  1.00000 2.8 TiB 2.0 TiB 733 GiB 74.04 0.93 149             osd.20
 21   hdd   2.73000  1.00000 2.8 TiB 2.2 TiB 533 GiB 81.12 1.02 155             osd.21
 22   hdd   2.73000  1.00000 2.8 TiB 2.1 TiB 692 GiB 75.48 0.95 144             osd.22
 23   hdd   2.73000  1.00000 2.8 TiB 1.6 TiB 1.1 TiB 58.43 0.74 130             osd.23
 24   hdd   2.73000  1.00000 2.8 TiB 2.2 TiB 579 GiB 79.51 1.00 146             osd.24
 25   hdd   2.73000  1.00000 2.8 TiB 1.9 TiB 886 GiB 68.63 0.87 147             osd.25
 31   hdd   5.45999  1.00000 5.5 TiB 4.7 TiB 758 GiB 86.50 1.09 326             osd.31
  6   ssd   0.74500  0.89999 744 GiB 640 GiB 104 GiB 86.01 1.09  61             osd.6
-17        28.04494        -  28 TiB  22 TiB 6.3 TiB 77.61 0.98   -           host arh-ibstorage4-ib
  8   hdd   2.73000  1.00000 2.8 TiB 1.9 TiB 909 GiB 67.80 0.86 141             osd.8
 17   hdd   2.73000  1.00000 2.8 TiB 1.9 TiB 904 GiB 67.99 0.86 144             osd.17
 27   hdd   2.73000  1.00000 2.8 TiB 2.1 TiB 654 GiB 76.84 0.97 152             osd.27
 28   hdd   2.73000  1.00000 2.8 TiB 2.3 TiB 481 GiB 82.98 1.05 153             osd.28
 29   hdd   2.73000  1.00000 2.8 TiB 1.9 TiB 829 GiB 70.65 0.89 137             osd.29
 30   hdd   2.73000  1.00000 2.8 TiB 2.0 TiB 762 GiB 73.03 0.92 142             osd.30
 33   hdd   2.73000  1.00000 2.8 TiB 2.3 TiB 501 GiB 82.25 1.04 166             osd.33
 34   hdd   5.45998  1.00000 5.5 TiB 4.5 TiB 968 GiB 82.77 1.04 325             osd.34
 39   hdd   2.73000  0.95000 2.8 TiB 2.4 TiB 402 GiB 85.77 1.08 162             osd.39
 38   ssd   0.74500  1.00000 745 GiB 671 GiB  74 GiB 90.02 1.14  68             osd.38
                       TOTAL 113 TiB  90 TiB  23 TiB 79.25
MIN/MAX VAR: 0.74/1.14  STDDEV: 8.14
# for i in $(radosgw-admin bucket list | jq -r '.[]'); do radosgw-admin bucket stats --bucket=$i | jq '.usage | ."rgw.main" | .size_kb'; done | awk '{ SUM += $1 } END { print SUM/1024/1024/1024 }'
6.59098
# ceph df
GLOBAL:
    SIZE        AVAIL      RAW USED     %RAW USED
    113 TiB     23 TiB     90 TiB       79.25
POOLS:
    NAME                           ID     USED        %USED     MAX AVAIL     OBJECTS
    Primary-ubuntu-1               5      27 TiB      87.56     3.9 TiB       7302534
    .users.uid                     15     6.8 KiB     0         3.9 TiB       39
    .users                         16     335 B       0         3.9 TiB       20
    .users.swift                   17     14 B        0         3.9 TiB       1
    .rgw.buckets                   19     15 TiB      79.88     3.9 TiB       8787763
    .users.email                   22     0 B         0         3.9 TiB       0
    .log                           24     109 MiB     0         3.9 TiB       102301
    .rgw.buckets.extra             37     0 B         0         2.6 TiB       0
    .rgw.root                      44     2.9 KiB     0         2.6 TiB       16
    .rgw.meta                      45     1.7 MiB     0         2.6 TiB       6249
    .rgw.control                   46     0 B         0         2.6 TiB       8
    .rgw.gc                        47     0 B         0         2.6 TiB       32
    .usage                         52     0 B         0         2.6 TiB       0
    .intent-log                    53     0 B         0         2.6 TiB       0
    default.rgw.buckets.non-ec     54     0 B         0         2.6 TiB       0
    .rgw.buckets.index             55     0 B         0         2.6 TiB       11485
    .rgw                           56     491 KiB     0         2.6 TiB       1686
    Primary-ubuntu-1-ssd           57     1.2 TiB     92.39     105 GiB       379516
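Regarding question 4 above, here is my rough attempt at estimating the average RGW object size from the bucket stats. I am assuming the "num_objects" field under ."rgw.main" is the right object count to divide by, so please correct me if this is not the measure you had in mind:

# sum size_kb and num_objects across all buckets, then divide
for i in $(radosgw-admin bucket list | jq -r '.[]'); do
  radosgw-admin bucket stats --bucket=$i | jq -r '.usage | ."rgw.main" | [.size_kb, .num_objects] | @tsv'
done | awk '{ KB += $1; OBJ += $2 } END { if (OBJ > 0) printf "average object size: ~%.0f KiB\n", KB/OBJ }'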
I am not too sure the issue relates to BlueStore overhead, as I would probably have seen the same discrepancy in my Primary-ubuntu-1 pool as well. The data usage on the Primary-ubuntu-1 pool seems consistent with my expectations (precise numbers to be verified soon). The issue seems to be only with the .rgw.buckets pool, where the "ceph df" output shows 15 TiB of usage while the sum of all buckets in that pool comes to just over 6.5 TiB.
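For what it's worth, the quick arithmetic behind the 15 TiB figure, taking the "ceph df" numbers above at face value (I realise the RADOS object count includes stripe and multipart parts, so this is only indicative):

# implied average size per RADOS object in .rgw.buckets: 15 TiB spread over 8787763 objects
awk 'BEGIN { printf "~%.0f KiB per RADOS object\n", 15 * 1024 * 1024 * 1024 / 8787763 }'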
Cheers
Andrei
Hi Andrei,
The most obvious reason is space usage overhead caused by BlueStore allocation granularity: e.g. if bluestore_min_alloc_size is 64K and the average object size is 16K, one will waste 48K per object on average (a rough estimate of the total waste is sketched after the list below). This is rather speculation so far, as we lack the key information about your cluster:
- Ceph version
- What are the main devices for OSD: hdd
or ssd.
- BlueStore or FileStore.
- average RGW object size.
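For illustration, a back-of-the-envelope estimate of that padding waste. The numbers here are placeholders (1,000,000 objects, 64 KiB min_alloc_size, 16 KiB average object size), not measurements from your cluster:

# total padding lost to allocation granularity, all sizes in KiB
awk 'BEGIN {
  objects = 1000000; alloc = 64; avg = 16;
  waste_per_obj = (avg % alloc) ? alloc - (avg % alloc) : 0;
  printf "~%.1f GiB wasted\n", objects * waste_per_obj / 1024 / 1024
}'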
You might also want to collect and share
performance counter dumps (ceph daemon
osd.N perf dump) and "
" reports from a couple of your OSDs.
Thanks,
Igor
On 7/2/2019
11:43 AM, Andrei Mikhailovsky wrote:
Bump!
Hi
Could someone please explain
/ show how to troubleshoot the
space usage in Ceph and how to
reclaim the unused space?
I have a small cluster with
40 osds, replica of 2, mainly
used as a backend for cloud
stack as well as the S3 gateway.
The used space doesn't make any
sense to me, especially the rgw
pool, so I am seeking help.
Here is what I found from the
client:
ceph -s shows the usage: 89 TiB used, 24 TiB / 113 TiB avail

ceph df shows:
    Primary-ubuntu-1         5     27 TiB      90.11     3.0 TiB     7201098
    Primary-ubuntu-1-ssd     57    1.2 TiB     89.62     143 GiB     359260
    .rgw.buckets             19    15 TiB      83.73     3.0 TiB     8742222
The usage of Primary-ubuntu-1 and Primary-ubuntu-1-ssd is in line with my expectations. However, the .rgw.buckets pool seems to be using way too much. The usage across all rgw buckets comes to 6.5 TB (looking at the size_kb values from "radosgw-admin bucket stats"). I am trying to figure out why .rgw.buckets is using 15 TB of space instead of the 6.5 TB shown by the bucket usage.
Thanks
Andrei
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com