Re: Include BlueFS DB space in total/used stats or not?

On 10/22/18 12:18 PM, Igor Fedotov wrote:


On 10/22/2018 7:49 PM, Sage Weil wrote:
On Mon, 22 Oct 2018, Igor Fedotov wrote:
Hi folks,

doing the last cleanup for https://github.com/ceph/ceph/pull/19454

I realized that we still include the space of a separate DB volume in the total
space reported by BlueStore (and hence treat it as used, too).

It seems we had such a discussion a while ago, but unfortunately I don't recall
the outcome.


See BlueStore::statfs(...)

  if (bluefs) {
    // include dedicated db, too, if that isn't the shared device.
    if (bluefs_shared_bdev != BlueFS::BDEV_DB) {
      buf->total += bluefs->get_total(BlueFS::BDEV_DB);
    }
    // ...
  }

I'm not sure there is any rationale behind that, and I have a strong desire
to remove it from the 'total' calculation.

Just want to share two options for the new 'ceph df' output.

3x OSD config: 10 GiB block device + 1 GiB DB device + 1 GiB WAL device.

1) DB space included.

GLOBAL:
     SIZE       AVAIL      USED        RAW USED     %RAW USED
     33 GiB     27 GiB     2.9 GiB     5.9 GiB      18.01
POOLS:
     NAME                  ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
     cephfs_data_a         1      0 B         0           0 B         0         8.9 GiB
     cephfs_metadata_a     2      2.2 KiB     22          384 KiB     0         8.9 GiB

2) DB space isn't included.

GLOBAL:
     SIZE       AVAIL      USED        RAW USED     %RAW USED
     30 GiB     27 GiB     1.9 MiB     3.0 GiB      10.01
POOLS:
     NAME                  ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
     cephfs_data_a         1      0 B         0           0 B         0         8.9 GiB
     cephfs_metadata_a     2      2.2 KiB     22          384 KiB     0         8.9 GiB

So for the first case, GLOBAL SIZE includes the space of both the block and DB
devices. RAW USED includes 3 GB for the separate BlueFS volumes plus ~3 GB
permanently reserved for BlueFS at the slow device, as per the
bluestore_bluefs_min_free config parameter, not to mention the space actually
allocated for user data.

And AVAIL is equal to SIZE - RAW USED.

For the second option (which I'm inclined toward), all the numbers exclude that
DB device space and hence, IMO, provide a more consistent picture.

Please note that the 3x1 GB allocated for WAL isn't taken into account in
either case.

So the question is: which variant do we prefer? Are there any reasons to
account for DB space here?

Doesn't it make sense to export BlueFS stats (total/avail) separately? Along
with the existing "internal_metadata" and "omap_allocated" fields, that would
allow building a complete and consistent view of DB space usage if needed.
I think this is the key.  The problem is that the db space is 'available'
space, but only for omap (or metadata).  I think the way to complete the
picture may be to start with (1), but then also expose separate
data_available and omap_available values?
Hence, if going with (1), we'd get something like this one day (given that the DB uses 1 GB):

GLOBAL:
     SIZE       AVAIL      DATA_AVAIL   USED        RAW USED     %RAW USED
     33 GiB     29 GiB     27 GIB       2.9 GiB      5.9 GiB         18.01

Not very transparent, IMO.
I'd prefer to split data and metadata usage and have something like the following:

DATA:
     SIZE       AVAIL      USED        RAW USED     %RAW USED
     30 GiB     27 GiB     1.9 MiB      3.0 GiB         10.01
META:
     SIZE       AVAIL      USED        OMAP_USED    %USED
     6 GiB      5 GiB      1 GiB       256 MiB      ZZZ


That would be fantastic!  I'm all for it.


