Here is a strange problem that I can't seem to figure out. Some of our
OSDs that have zero weight and no PGs still have a lot of allocated
space:
[root@cephosd0032 ~]# ceph osd df
2925 ssd 0 1.00000 1.1 TiB 126 GiB 124 GiB 3.2 MiB 1021 MiB 984 GiB 11.32 0.15 0 up
2926 ssd 0 1.00000 1.1 TiB 126 GiB 124 GiB 3.2 MiB 1021 MiB 984 GiB 11.33 0.15 0 up
2927 ssd 0 1.00000 1.1 TiB 125 GiB 124 GiB 3.2 MiB 1021 MiB 984 GiB 11.31 0.15 0 up
2928 ssd 0 1.00000 1.1 TiB 126 GiB 124 GiB 3.2 MiB 1021 MiB 984 GiB 11.32 0.15 0 up
So 120GB+ is allocated, but there are no PGs. The cluster is clean: all
PGs are active+clean, and no rebalancing/recovery/etc. is happening.
There has been no rebalancing/recovery for at least a couple of days
(after having added some nodes to the cluster).
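To pick out every OSD in this state, the plain-text output can be filtered. This is only a rough sketch: the function name `zero_pg_osds` is made up, and the field positions assume the whitespace-separated row layout shown above (each size is a value/unit pair, so a row has 20 fields), which may differ on other releases.

```shell
# Hypothetical helper (not part of Ceph): read `ceph osd df` rows on stdin and
# print the ID and raw usage of every OSD whose weight and PG count are both 0.
# Assumed field positions: weight = field 3, raw use = fields 7-8, PGs = field 19.
zero_pg_osds() {
    awk 'NF == 20 && $1 ~ /^[0-9]+$/ && $3 == 0 && $19 == 0 { print $1, $7, $8 }'
}

# Example with one row copied from the output above. On a live cluster this
# would be:  ceph osd df | zero_pg_osds
zero_pg_osds <<'EOF'
2925 ssd 0 1.00000 1.1 TiB 126 GiB 124 GiB 3.2 MiB 1021 MiB 984 GiB 11.32 0.15 0 up
EOF
```

The `NF == 20` and numeric-ID checks are there to skip header and summary lines rather than misread them.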
A perf dump on the OSD confirms the space usage, showing 130GB or so
allocated:
[root@cephosd0032 ~]# ceph daemon /var/run/ceph/ceph-osd.2925.asok perf dump
"bluestore_allocated": 133277466624,
"bluestore_stored": 133014427663,
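For comparison with the `ceph osd df` figures, the two counters can be converted to binary units. A minimal sketch: `bytes_to_gib` is a made-up helper, and only the two byte values come from the perf dump above.

```shell
# Hypothetical helper (not part of Ceph): convert a byte count to GiB.
bytes_to_gib() {
    awk -v b="$1" 'BEGIN { printf "%.2f\n", b / (1024 * 1024 * 1024) }'
}

allocated=133277466624   # bluestore_allocated from the perf dump
stored=133014427663      # bluestore_stored from the perf dump

echo "allocated: $(bytes_to_gib "$allocated") GiB"
echo "stored:    $(bytes_to_gib "$stored") GiB"
# Gap between allocated and stored, in MiB (integer division).
echo "gap:       $(( (allocated - stored) / 1024 / 1024 )) MiB"
```

Both values come out at roughly 124 GiB, which lines up with the 124 GiB shown by `ceph osd df`, and the gap between allocated and stored is only about 250 MiB.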
With the OSD brought down, ceph-objectstore-tool claims that there are
no objects stored in it:
[root@cephosd0032 ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2925 --op list
[root@cephosd0032 ~]#
No fragmentation on the OSD either:
[root@cephosd0032 ~]# ceph daemon /var/run/ceph/ceph-osd.2925.asok bluestore allocator fragmentation block
"fragmentation_rating": 1.2165147867554808e-08
BlueFS also claims not to use significant space:
[root@cephosd0032 ~]# ceph daemon /var/run/ceph/ceph-osd.2925.asok bluefs stats
1 : device size 0x11548000000 : own 0x[8518510000~b175d0000] = 0xb175d0000 : using 0xd1e0000(210 MiB) : bluestore has 0xeb3b630000(941 GiB) available
wal_total:0, db_total:1131368205516, slow_total:0
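The hex values in that line are byte counts, so they can be decoded to check the human-readable figures. A small sketch: `hex_to_units` is a made-up helper, and the labels in the `echo` lines are my reading of the fields, not official names.

```shell
# Hypothetical helper (not part of Ceph): print a hex byte count in MiB and GiB.
hex_to_units() {
    local bytes=$(( $1 ))   # shell arithmetic accepts 0x-prefixed hex
    awk -v b="$bytes" 'BEGIN { printf "%.0f MiB / %.1f GiB\n", b / 1048576, b / 1073741824 }'
}

echo "device size:         $(hex_to_units 0x11548000000)"
echo "BlueFS owns:         $(hex_to_units 0xb175d0000)"
echo "BlueFS using:        $(hex_to_units 0xd1e0000)"
echo "bluestore available: $(hex_to_units 0xeb3b630000)"
```

The decoded values agree with what the output prints in parentheses (210 MiB used by BlueFS, about 941 GiB available to BlueStore), so BlueFS usage is far too small to explain the roughly 124 GiB that is allocated.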
So the puzzle is, what is stored in that 130GB allocated space?
The cluster is Octopus 15.2.17. I started looking into this because of a
mismatch in free space on our in-service OSDs. After adding some nodes
and waiting for the rebalance to finish, the free space didn't increase
by the amount of space that was added, although the total space in the
cluster did increase correctly (and the data in pools didn't grow by
enough to explain this). So there is a bit of a mystery regarding what
a bunch of space is allocated for.
Any ideas/pointers would be appreciated.