On 10/22/2018 8:18 PM, Igor Fedotov wrote:
On 10/22/2018 7:49 PM, Sage Weil wrote:
On Mon, 22 Oct 2018, Igor Fedotov wrote:
Hi folks,
While doing the last cleanup for https://github.com/ceph/ceph/pull/19454
I realized that we still include the space of a separate DB volume in the
total space reported by BlueStore (and hence treat it as used, too).
It seems we had such a discussion a while ago, but unfortunately I don't
recall the outcome.
See BlueStore::statfs(...):

  if (bluefs) {
    // include dedicated db, too, if that isn't the shared device.
    if (bluefs_shared_bdev != BlueFS::BDEV_DB) {
      buf->total += bluefs->get_total(BlueFS::BDEV_DB);
    }

I'm not sure there is any rationale behind that, and I have a strong
desire to remove it from the 'total' calculation.
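
To make the two choices concrete, here is a minimal sketch of how the reported total differs depending on whether the dedicated DB volume is counted. It assumes three identical OSDs, each with a 10 GiB block device and a 1 GiB dedicated DB device. OsdDevices, total_with_db, total_without_db and cluster_size are hypothetical names used only for illustration; they are not part of the Ceph source.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t GiB = 1ull << 30;

// Hypothetical, simplified stand-in for one OSD's devices.
// These names are illustrative and do not come from the Ceph source.
struct OsdDevices {
  uint64_t block_bytes;  // main (slow) block device
  uint64_t db_bytes;     // dedicated DB device, 0 if DB shares the block device
};

// Current behaviour: the dedicated DB volume is added to 'total'.
constexpr uint64_t total_with_db(const OsdDevices& d) {
  return d.block_bytes + d.db_bytes;
}

// Proposed behaviour: 'total' covers the block device only.
constexpr uint64_t total_without_db(const OsdDevices& d) {
  return d.block_bytes;
}

// Cluster-wide SIZE for n identical OSDs (an assumption of this sketch).
constexpr uint64_t cluster_size(uint64_t per_osd_total, unsigned n_osds) {
  return per_osd_total * n_osds;
}
```

Under these assumptions the first variant yields a cluster SIZE of 33 GiB and the second 30 GiB.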
I just want to share two options for the new 'ceph df' output.
3x OSD config: 10 GB block device + 1 GB DB device + 1 GB WAL device.
1) DB space included.
GLOBAL:
    SIZE       AVAIL      USED        RAW USED     %RAW USED
    33 GiB     27 GiB     2.9 GiB     5.9 GiB      18.01
POOLS:
    NAME                  ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
    cephfs_data_a         1      0 B         0           0 B         0         8.9 GiB
    cephfs_metadata_a     2      2.2 KiB     22          384 KiB     0         8.9 GiB
2) DB space isn't included.
GLOBAL:
    SIZE       AVAIL      USED        RAW USED     %RAW USED
    30 GiB     27 GiB     1.9 MiB     3.0 GiB      10.01
POOLS:
    NAME                  ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
    cephfs_data_a         1      0 B         0           0 B         0         8.9 GiB
    cephfs_metadata_a     2      2.2 KiB     22          384 KiB     0         8.9 GiB
So in the first case GLOBAL SIZE includes the space of both the block and
DB devices.
RAW USED includes 3 GB for the separate BlueFS volumes and ~3 GB
permanently reserved for BlueFS on the slow device as per the
bluestore_bluefs_min_free config parameter, not to mention the space
actually allocated for user data.
And AVAIL is equal to SIZE - RAW USED.
With the second option (the one I'm inclined toward) all the numbers lack
that DB device space and hence, IMO, provide a more consistent picture.
Please note that the 3x1 GB allocated for WAL isn't taken into account in
either case.
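
As a quick sanity check of the AVAIL = SIZE - RAW USED relation, the sketch below plugs in the GLOBAL figures exactly as displayed in the two outputs above (in GiB, already rounded). GlobalRow, expected_avail and pct_raw_used are hypothetical helper names introduced for this example.

```cpp
#include <cassert>
#include <cmath>

// GLOBAL figures as displayed by 'ceph df' in the two variants (GiB, rounded).
// GlobalRow is a hypothetical helper type, not a Ceph structure.
struct GlobalRow {
  double size;
  double avail;
  double raw_used;
};

constexpr GlobalRow with_db{33.0, 27.0, 5.9};     // option 1
constexpr GlobalRow without_db{30.0, 27.0, 3.0};  // option 2

// AVAIL as derived from the other two columns.
inline double expected_avail(const GlobalRow& r) {
  return r.size - r.raw_used;
}

// %RAW USED recomputed from the displayed (rounded) columns.
inline double pct_raw_used(const GlobalRow& r) {
  return 100.0 * r.raw_used / r.size;
}
```

With the displayed values, option 1 gives 33 - 5.9 = 27.1 GiB and about 17.9 %, while option 2 gives 30 - 3.0 = 27.0 GiB and 10.0 %. The small differences from the reported 18.01 and 10.01 presumably come from rounding of the displayed columns; that is an assumption, not something confirmed in this thread.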
So the question is: which variant do we prefer? Are there any reasons to
account for DB space here?
Wouldn't it make sense to export BlueFS stats (total/avail) separately,
which (along with the existing "internal_metadata" and "omap_allocated"
fields) would allow building a perfect and consistent view of DB space
usage if needed?
I think this is the key. The problem is that the db space is 'available'
space, but only for omap (or metadata). I think the way to complete the
picture may be to start with (1), but then also expose separate
data_available and omap_available values?
Hence, if we go with (1), we would get something like this one day (given
that the DB uses 1 GB):
GLOBAL:
    SIZE       AVAIL      DATA_AVAIL     USED        RAW USED     %RAW USED
    33 GiB     29 GiB     27 GiB         2.9 GiB     5.9 GiB      18.01
Not very transparent IMO.
I'd prefer to split data and metadata usage, which rather tends toward
(2), and have something like the following:
DATA:
    SIZE       AVAIL      USED        RAW USED     %RAW USED
    30 GiB     27 GiB     1.9 MiB     3.0 GiB      10.01
META:
    SIZE       AVAIL      USED      OMAP_USED     %USED
    6 GiB      5 GiB      1 GiB     256 MiB       ZZZ
sage
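
For illustration, the split DATA/META view proposed above could be carried by a pair of small structures like the ones sketched here. This is only a sketch under stated assumptions: DataStats, MetaStats, SplitDf and example_split_df are hypothetical names, not existing Ceph types, and the figures are simply the ones from the example tables.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t MiB = 1ull << 20;
constexpr uint64_t GiB = 1ull << 30;

// Hypothetical layout; none of these are existing Ceph structures.
struct DataStats {
  uint64_t size;      // block devices only
  uint64_t raw_used;  // everything consumed on the block devices
  constexpr uint64_t avail() const { return size - raw_used; }
};

struct MetaStats {
  uint64_t size;       // metadata volumes (the thread does not break down the 6 GiB)
  uint64_t used;       // space consumed on the metadata volumes
  uint64_t omap_used;  // OMAP_USED column; its relation to 'used' is not specified
  constexpr uint64_t avail() const { return size - used; }
};

struct SplitDf {
  DataStats data;
  MetaStats meta;
};

// The figures from the DATA and META example tables.
constexpr SplitDf example_split_df() {
  return SplitDf{{30 * GiB, 3 * GiB}, {6 * GiB, 1 * GiB, 256 * MiB}};
}
```

Keeping the two groups apart means each AVAIL value is derived only from its own SIZE and usage, so data and metadata space are never mixed in a single figure.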