> On 16 October 2017 at 18:14, Richard Hesketh <richard.hesketh@xxxxxxxxxxxx> wrote:
>
>
> On 16/10/17 13:45, Wido den Hollander wrote:
> >> On 26 September 2017 at 16:39, Mark Nelson <mnelson@xxxxxxxxxx> wrote:
> >> On 09/26/2017 01:10 AM, Dietmar Rieder wrote:
> >>> thanks David,
> >>>
> >>> that's confirming what I was assuming. Too bad that there is no
> >>> estimate/method to calculate the db partition size.
> >>
> >> It's possible that we might be able to get ranges for certain kinds of
> >> scenarios. Maybe if you do lots of small random writes on RBD, you can
> >> expect a typical metadata size of X per object. Or maybe if you do lots
> >> of large sequential object writes in RGW, it's more like Y. I think
> >> it's probably going to be tough to make it accurate for everyone though.
> >
> > So I did a quick test. I wrote 75,000 objects to a BlueStore device:
> >
> > root@alpha:~# ceph daemon osd.0 perf dump|jq '.bluestore.bluestore_onodes'
> > 75085
> > root@alpha:~#
> >
> > I then saw that the RocksDB database was 450MB in size:
> >
> > root@alpha:~# ceph daemon osd.0 perf dump|jq '.bluefs.db_used_bytes'
> > 459276288
> > root@alpha:~#
> >
> > 459276288 / 75085 = 6116
> >
> > So that is about 6 kB of RocksDB data per object.
> >
> > Let's say I want to store 1M objects in a single OSD: I would need ~6GB of DB space.
> >
> > Is this a safe assumption? Do you think that 6 kB is normal? Low? High?
> >
> > There aren't many of these numbers out there for BlueStore right now, so I'm trying to gather some data points.
> >
> > Wido
> If I check the same stats on the OSDs in my production cluster I see similar but variable values:
>
> root@vm-ds-01:~/ceph-conf# for i in {0..9} ; do echo -n "osd.$i db per object: " ; expr `ceph daemon osd.$i perf dump | jq '.bluefs.db_used_bytes'` / `ceph daemon osd.$i perf dump | jq '.bluestore.bluestore_onodes'` ; done
> osd.0 db per object: 7490
> osd.1 db per object: 7523
> osd.2 db per object: 7378
> osd.3 db per object: 7447
> osd.4 db per object: 7233
> osd.5 db per object: 7393
> osd.6 db per object: 7074
> osd.7 db per object: 7967
> osd.8 db per object: 7253
> osd.9 db per object: 7680
>
> root@vm-ds-02:~# for i in {10..19} ; do echo -n "osd.$i db per object: " ; expr `ceph daemon osd.$i perf dump | jq '.bluefs.db_used_bytes'` / `ceph daemon osd.$i perf dump | jq '.bluestore.bluestore_onodes'` ; done
> osd.10 db per object: 5168
> osd.11 db per object: 5291
> osd.12 db per object: 5476
> osd.13 db per object: 4978
> osd.14 db per object: 5252
> osd.15 db per object: 5461
> osd.16 db per object: 5135
> osd.17 db per object: 5126
> osd.18 db per object: 9336
> osd.19 db per object: 4986
>
> root@vm-ds-03:~# for i in {20..29} ; do echo -n "osd.$i db per object: " ; expr `ceph daemon osd.$i perf dump | jq '.bluefs.db_used_bytes'` / `ceph daemon osd.$i perf dump | jq '.bluestore.bluestore_onodes'` ; done
> osd.20 db per object: 5115
> osd.21 db per object: 4844
> osd.22 db per object: 5063
> osd.23 db per object: 5486
> osd.24 db per object: 5228
> osd.25 db per object: 4966
> osd.26 db per object: 5047
> osd.27 db per object: 5021
> osd.28 db per object: 5321
> osd.29 db per object: 5150
>
> root@vm-ds-04:~# for i in {30..39} ; do echo -n "osd.$i db per object: " ; expr `ceph daemon osd.$i perf dump | jq '.bluefs.db_used_bytes'` / `ceph daemon osd.$i perf dump | jq '.bluestore.bluestore_onodes'` ; done
> osd.30 db per object: 6658
> osd.31 db per object: 6445
> osd.32 db per object: 6259
> osd.33 db per object: 6691
> osd.34 db per object: 6513
> osd.35 db per object: 6628
> osd.36 db per object: 6779
> osd.37 db per object: 6819
> osd.38 db per object: 6677
> osd.39 db per object: 6689
>
> root@vm-ds-05:~# for i in {40..49} ; do echo -n "osd.$i db per object: " ; expr `ceph daemon osd.$i perf dump | jq '.bluefs.db_used_bytes'` / `ceph daemon osd.$i perf dump | jq '.bluestore.bluestore_onodes'` ; done
> osd.40 db per object: 5335
> osd.41 db per object: 5203
> osd.42 db per object: 5552
> osd.43 db per object: 5188
> osd.44 db per object: 5218
> osd.45 db per object: 5157
> osd.46 db per object: 4956
> osd.47 db per object: 5370
> osd.48 db per object: 5117
> osd.49 db per object: 5313
>
> I'm not sure why there is so much variance (these nodes are basically
> identical), and I think that db_used_bytes includes the WAL, at least in my
> case, as I don't have a separate WAL device. I'm not sure how big the WAL is
> relative to the metadata and hence how much this might throw the numbers
> off, but ~6 kB/object seems like a reasonable value to take for
> back-of-envelope calculations.

Yes, judging from your numbers, ~6 kB/object seems reasonable. More datapoints are welcome in this case.
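
If anyone wants to contribute a datapoint, the ratio can be read per OSD in one pass by letting jq do the division itself, since both counters come from the same perf dump. A rough sketch, assuming it runs on the node hosting the OSDs and the admin sockets are reachable (adjust the id range to your host):

  for i in {0..9} ; do
      echo -n "osd.$i db bytes per object: "
      # one 'perf dump' per OSD; jq divides db_used_bytes by the onode count
      ceph daemon osd.$i perf dump | jq '.bluefs.db_used_bytes / .bluestore.bluestore_onodes'
  done

Unlike the expr version above, this queries each OSD only once and keeps the fractional part instead of truncating.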

Some input from a BlueStore dev might be helpful as well, to make sure we are not drawing the wrong conclusions here.

Wido

> [bonus hilarity]
> On my all-in-one-SSD OSDs, because bluestore reports the whole device as db space, I get results like:
>
> root@vm-hv-01:~# for i in {60..65} ; do echo -n "osd.$i db per object: " ; expr `ceph daemon osd.$i perf dump | jq '.bluefs.db_used_bytes'` / `ceph daemon osd.$i perf dump | jq '.bluestore.bluestore_onodes'` ; done
> osd.60 db per object: 80273
> osd.61 db per object: 68859
> osd.62 db per object: 45560
> osd.63 db per object: 38209
> osd.64 db per object: 48258
> osd.65 db per object: 50525
>
> Rich
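
To turn the ~6 kB/object figure into a DB partition size, a back-of-envelope approach is to estimate the object count from the OSD capacity and the average object size, then multiply. A rough sketch; the 4 TB OSD and the 4 MB average object size (the RBD default) are illustrative assumptions, and 6116 bytes/object is the value measured above:

  osd_bytes=$((4 * 1024 * 1024 * 1024 * 1024))   # assumed 4 TB OSD
  object_bytes=$((4 * 1024 * 1024))              # assumed average object size (RBD default)
  db_per_object=6116                             # bytes of RocksDB data per onode, measured above

  objects=$((osd_bytes / object_bytes))
  db_mb=$((objects * db_per_object / 1024 / 1024))
  echo "expected objects on the OSD: $objects"   # ~1 million
  echo "estimated DB usage: ${db_mb} MB"         # ~6 GB, matching the 1M-object estimate above

In practice you would want headroom on top of that, both for RocksDB compaction and because, as noted above, db_used_bytes may include the WAL when there is no separate WAL device.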