Re: Bluestore DB size and onode count

On 09/10/2018 12:22 PM, Igor Fedotov wrote:

Hi Nick.


On 9/10/2018 1:30 PM, Nick Fisk wrote:
If anybody has 5 minutes, could they clarify a couple of things for me?

1. onode count: should this be equal to the number of objects stored on the OSD? From reading several posts, the general indication is that it should be, but looking at my OSDs the maths don't work.

onode_count is the number of onodes in the cache, not the total number of onodes on the OSD, hence the difference.

E.g.
ceph osd df
ID CLASS WEIGHT  REWEIGHT SIZE  USE    AVAIL  %USE  VAR  PGS
  0   hdd 2.73679  1.00000 2802G  1347G  1454G 48.09 0.69 115

So a 3TB OSD, roughly half full. This is a pure RBD workload (no snapshots or anything clever), so let's assume the worst-case scenario of 4MB objects (compression is on, however, which would only mean more objects for a given size).
1347000/4=~336750 expected objects
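
For anyone wanting to repeat that estimate on their own OSDs, here is a minimal sketch of the same arithmetic. It assumes, as the line above does, that 1G is treated as 1000MB and that every object is a full 4MB; the function name is made up for illustration.

```python
# Floor on the object count, mirroring the arithmetic in the post.
# Assumption: the "G" figure from `ceph osd df` is treated as 1000 MB,
# and every object is a full-size 4 MB RBD object.

def expected_objects(used_gb: int, object_size_mb: int = 4) -> int:
    """Estimate how many full-size objects make up the used space."""
    used_mb = used_gb * 1000
    return used_mb // object_size_mb


print(expected_objects(1347))  # 336750
```

Smaller or compressed objects would only push the real count higher, so this is the low end.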

sudo ceph daemon osd.0 perf dump | grep blue
     "bluefs": {
     "bluestore": {
         "bluestore_allocated": 1437813964800,
         "bluestore_stored": 2326118994003,
         "bluestore_compressed": 445228558486,
         "bluestore_compressed_allocated": 547649159168,
         "bluestore_compressed_original": 1437773843456,
         "bluestore_onodes": 99022,
         "bluestore_onode_hits": 18151499,
         "bluestore_onode_misses": 4539604,
         "bluestore_onode_shard_hits": 10596780,
         "bluestore_onode_shard_misses": 4632238,
         "bluestore_extents": 896365,
         "bluestore_blobs": 861495,

99022 onodes, anyone care to enlighten me?
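
Since bluestore_onodes counts cached onodes only, it can be read next to the hit and miss counters from the same dump. The following is a rough sketch using the figures above; the helper names are hypothetical, and 336750 is the estimate from earlier in this mail, not a measured object count.

```python
# Counters copied from the perf dump above.
counters = {
    "bluestore_onodes": 99022,
    "bluestore_onode_hits": 18151499,
    "bluestore_onode_misses": 4539604,
}


def onode_hit_ratio(c: dict) -> float:
    """Fraction of onode lookups served from the cache."""
    total = c["bluestore_onode_hits"] + c["bluestore_onode_misses"]
    return c["bluestore_onode_hits"] / total if total else 0.0


def cached_fraction(c: dict, estimated_objects: int) -> float:
    """Cached onodes as a share of an estimated object count."""
    return c["bluestore_onodes"] / estimated_objects


print(f"hit ratio: {onode_hit_ratio(counters):.2%}")
# 336750 is the 4 MB-object estimate, i.e. an assumption, not a measurement.
print(f"cached share of estimate: {cached_fraction(counters, 336750):.2%}")
```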

2. block.db Size
sudo ceph daemon osd.0 perf dump | grep db
         "db_total_bytes": 8587829248,
         "db_used_bytes": 2375024640,

2.3GB = 0.17% of data size. This seems a lot lower than the 1% recommendation (10GB for every 1TB) or the 4% given in the official docs. I know that different workloads will have differing overheads and potentially smaller objects, but am I understanding these figures correctly? They seem dramatically lower.

Just in case: is slow_used_bytes equal to 0? Some DB data might reside on the slow device if spillover has happened. Spillover doesn't require the DB volume to be full; that's by RocksDB's design.
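
Both checks (DB size relative to data, and whether anything has spilled to the slow device) can be done in one pass over the bluefs counters. This is a sketch only: the function name is invented, the data size uses the 1347G figure from `ceph osd df` taken as decimal GB, and the slow_used_bytes value of 0 is a placeholder because that counter was not in the output posted above.

```python
def db_usage_report(bluefs: dict, data_bytes: int) -> dict:
    """Summarise block.db usage from the 'bluefs' perf counters."""
    used = bluefs["db_used_bytes"]
    total = bluefs["db_total_bytes"]
    slow = bluefs.get("slow_used_bytes", 0)
    return {
        "db_pct_of_data": 100.0 * used / data_bytes,  # DB size vs. stored data
        "db_pct_full": 100.0 * used / total,          # how full block.db is
        "spilled_over": slow > 0,                     # DB data on slow device
    }


bluefs = {
    "db_total_bytes": 8587829248,
    "db_used_bytes": 2375024640,
    "slow_used_bytes": 0,  # hypothetical: not shown in the posted output
}

# Assumption: 1347G from `ceph osd df` taken as 1347 * 10**9 bytes.
report = db_usage_report(bluefs, data_bytes=1347 * 10**9)
for key, value in report.items():
    print(key, value)
```

If spilled_over comes back true, db_used_bytes alone understates the real DB footprint.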

And the recommended numbers are a bit... speculative. So it's quite possible that your numbers are absolutely adequate.

FWIW, these are the numbers I came up with after examining the SST files generated under different workloads:

https://drive.google.com/file/d/1Ews2WR-y5k3TMToAm0ZDsm7Gf_fwvyFw/view?usp=sharing


Regards,
Nick

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
