osd daemon perf dump for one of my bluestore NVMe OSDs has [1] this excerpt. I grabbed those stats based on Wido's [2] script to determine how much DB overhead there is per object. My [3] calculations for this particular OSD are staggering: 99% of the space used on this OSD is consumed by the DB. The OSD is sitting between 90% and 97% disk usage, with occasional drops to 80% before climbing back up; it fluctuates wildly from one minute to the next.
One of my filestore NVMe OSDs in the same cluster has 99% of its used space in ./current/omap/.
This is causing IO stalls as well as OSDs flapping on the cluster. Does anyone have ideas for anything I can try? It's definitely not the actual PG data on the OSDs taking up the space. I tried adjusting the weights of the OSDs to better distribute the data, but moving the PGs around seemed to make things worse. Thank you.
[1] "bluestore_onodes": 167,
"stat_bytes_used": 143855271936,
"db_used_bytes": 142656667648,
[3]
    Average object size    = 143855271936 / 167 ≈ 821 MiB
    DB overhead per object = 142656667648 / 167 ≈ 814 MiB
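
For anyone who wants to reproduce the numbers in [3], here is a minimal sketch of the calculation (not Wido's actual script). It assumes the usual perf dump layout where bluestore_onodes sits under the "bluestore" section, stat_bytes_used under "osd", and db_used_bytes under "bluefs" (section names can differ between releases), and a placeholder osd_id you would adjust for your own daemon:

# Rough sketch: read counters from an OSD admin socket and compute
# per-object DB overhead. Section names are assumptions; check your
# own "ceph daemon osd.N perf dump" output if the keys don't match.
import json
import subprocess

osd_id = 0  # placeholder; set to the OSD you want to inspect

dump = json.loads(subprocess.check_output(
    ["ceph", "daemon", f"osd.{osd_id}", "perf", "dump"]))

onodes = dump["bluestore"]["bluestore_onodes"]
used = dump["osd"]["stat_bytes_used"]
db_used = dump["bluefs"]["db_used_bytes"]

print("avg object size:      %.0f MiB" % (used / onodes / 2**20))
print("DB overhead / object: %.0f MiB" % (db_used / onodes / 2**20))
print("DB share of used:     %.1f%%" % (100.0 * db_used / used))

If the counters match the excerpt in [1], this should print roughly the 821 MiB / 814 MiB figures above and a ~99% DB share.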