On 14/08/2018 15:57, Emmanuel Lacour wrote:
> On 13/08/2018 at 16:58, Jason Dillaman wrote:
>>
>> See [1] for ways to tweak the bluestore cache sizes. I believe that by
>> default, bluestore will not cache any data but instead will only
>> attempt to cache its key/value store and metadata.
>
> I suppose so too, because the default ratio is to cache as much k/v
> data as possible, up to 512M, and the hdd cache is 1G by default.
>
> I tried increasing the hdd cache to 4G and it seems to be used; the 4
> osd processes use 20GB now.
>
>> In general, however, I would think that attempting to have bluestore
>> cache data is just an attempt to optimize to the test instead of
>> actual workloads. Personally, I think it would be more worthwhile to
>> just run 'fio --ioengine=rbd' directly against a pre-initialized image
>> after you have dropped the cache on the OSD nodes.
>
> So with bluestore, I assume we need to think more about the client
> page cache (at least when using a VM), whereas with the old filestore
> both the osd and client caches were used.
>
> As for benchmarks, I ran a real benchmark here for the expected app
> workload of this new cluster and it's ok for us :)
>
> Thanks for your help Jason.

Shifting over a discussion from IRC, and taking the liberty to
resurrect an old thread, as I just ran into the same (?) issue.

I see *significantly* reduced performance on RBD reads compared to
writes with the same parameters. "rbd bench --io-type read" gives me 8K
IOPS (with the default 4K I/O size), whereas "rbd bench --io-type
write" produces more than twice that.

I should probably add that while my end result of an "rbd bench
--io-type read" is about half of what I get from a write benchmark, the
intermediate ops/sec output fluctuates from > 30K IOPS (about twice the
write IOPS) down to about 3K IOPS (about 1/6 of what I get for writes).
So really, my read IOPS are all over the map (and terrible on average),
whereas my write IOPS are not stellar, but consistent.

This is an all-bluestore cluster on spinning disks running Luminous,
and I've tried the following things:

- running rbd bench with --rbd_readahead_disable_after_bytes=0 and
  --rbd_readahead_max_bytes=4194304 (per
  http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-March/008271.html)

- configuring OSDs with a larger bluestore_cache_size_hdd (4G; the
  default is 1G)

- configuring OSDs with bluestore_cache_kv_ratio = .49, so that rather
  than using 1%/99%/0% for metadata/KV data/objects, the OSDs use
  1%/49%/50%

None of the above produced any tangible improvement (see the P.S. below
for rough sketches of what I ran and changed). Benchmark results are at
http://paste.openstack.org/show/736314/ if anyone wants to take a look.

I'd be curious to see if anyone has a suggestion on what else to try.
Thanks in advance!

Cheers,
Florian
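
P.S. For anyone who wants to reproduce the numbers above, the read and
write runs boil down to something like the following. This is only a
sketch: the pool/image name ("rbd/bench-img") is a placeholder, and the
image is assumed to have been fully written at least once so reads
don't just hit unallocated objects. The readahead overrides are the
ones from the list of things I tried.

  # write benchmark (default 4K I/O size)
  rbd bench --io-type write rbd/bench-img

  # read benchmark against the same, pre-populated image
  rbd bench --io-type read rbd/bench-img

  # read benchmark with client-side readahead forced on
  rbd bench --io-type read rbd/bench-img \
      --rbd_readahead_disable_after_bytes=0 \
      --rbd_readahead_max_bytes=4194304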
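
On the OSD side, the cache-related changes correspond roughly to the
ceph.conf snippet below. Again a sketch, not verbatim config: I'm
assuming the options live in the [osd] section and that the cache size
is given in bytes; the meta ratio is spelled out only to make the
1%/49%/50% split explicit.

  [osd]
  # 4G bluestore cache for HDD OSDs (default is 1G)
  bluestore_cache_size_hdd = 4294967296
  # split the cache 1% / 49% / 50% between metadata, KV data and
  # object data, instead of the default 1% / 99% / 0%
  bluestore_cache_meta_ratio = .01
  bluestore_cache_kv_ratio = .49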