Re: RBD Cache and rbd-nbd

Jason Dillaman <jdillama@xxxxxxxxxx> · Mon, 14 May 2018 05:33:00 -0700

On Mon, May 14, 2018 at 12:15 AM, Marc Schöchlin <ms@xxxxxxxxxx> wrote:
> Hello Jason,
>
> many thanks for your informative response!
>
> On 11.05.2018 at 17:02, Jason Dillaman wrote:
>> I cannot speak for Xen, but in general IO to a block device will hit
>> the pagecache unless the IO operation is flagged as direct (e.g.
>> O_DIRECT) to bypass the pagecache and directly send it to the block
>> device.
> Sure, but it seems that XenServer just forwards I/O from virtual machines
> (vm: blkfront, dom-0: blkback) to the nbd device in dom-0.
>>> Sorry, my question was a bit imprecise: I was looking for usage statistics
>>> of the rbd cache.
>>> Is there also a way to gather rbd_cache usage statistics as a basis for
>>> verifying and optimizing the cache settings?
>> You can run "perf dump" instead of "config show" to dump out the
>> current performance counters. There are some stats from the in-memory
>> cache included in there.
> Great, I was not aware of that.
> There are really a lot of statistics which might be useful for analyzing
> what is going on, or whether the optimizations improve the performance of
> our systems.
>>> Can you provide some hints about adequate cache settings for a
>>> write-intensive environment (70% write, 30% read)?
>>> Is it a good idea to specify a huge rbd cache of 1 GB with a max dirty age
>>> of 10 seconds?
>> Depends on your workload and your testing results. I suspect a
>> database on top of RBD is going to do its own read caching and will be
>> issuing lots of flush calls to the block device, potentially negating
>> the need for a large cache.
>
> Sure, reducing flushes at the cost of a degraded level of reliability
> seems to be one important key to improved performance.
>
>>>
>>> More than 70 percent of our typical workload consists of database write
>>> operations in the virtual machines.
>>> Therefore collecting write operations in the rbd cache and writing them
>>> to ceph in chunks might be a good thing.
>>> A higher limit for "rbd cache max dirty" might be adequate here.
>>> On the other hand, our read workload typically reads huge files
>>> sequentially.
>>>
>>> Therefore it might be useful to start with a configuration like this:
>>>
>>> rbd cache size = 64MB
>>> rbd cache max dirty = 48MB
>>> rbd cache target dirty = 32MB
>>> rbd cache max dirty age = 10
>>>
>>> What is the strategy of librbd to write data to the storage from rbd_cache
>>> if "rbd cache max dirty = 48MB" is reached?
>>> Is there a reduction of io operations (merging of ios) compared to the
>>> granularity of writes of my virtual machines?
>> If the cache is full, incoming IO will be stalled as the dirty bits
>> are written back to the backing RBD image to make room available for
>> the new IO request.
> Sure, I will have a look at the statistics and the throughput.
> Is there any consolidation of write requests in the rbd cache?
>
> Example:
> If a vm writes small io requests to the nbd device which belong to the
> same rados object, does librbd consolidate these requests into a single
> ceph io?
> What strategies does librbd use for that?

The librbd cache will consolidate sequential dirty extents within the
same object, but it does not consolidate all dirty extents within the
same object to the same write request.
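
To make that distinction concrete, here is a rough Python sketch of the
idea. It is an illustration only, not the librbd code: the function name
coalesce_extents and the (offset, length) tuples are made up for this
example. Dirty extents that touch each other are merged into one write,
while a gap inside the same object still leads to a separate write.

```python
def coalesce_extents(extents):
    """Merge (offset, length) byte ranges that touch or overlap.

    Hypothetical helper for illustration; ranges separated by a gap
    are left as separate writes, even within the same object.
    """
    merged = []
    for offset, length in sorted(extents):
        if merged and offset <= merged[-1][0] + merged[-1][1]:
            # This range starts at or before the end of the previous
            # one, so extend the previous range instead of adding one.
            prev_off, prev_len = merged[-1]
            end = max(prev_off + prev_len, offset + length)
            merged[-1] = (prev_off, end - prev_off)
        else:
            merged.append((offset, length))
    return merged


# Three 4 KiB writes to one object: two back to back, one further in.
writes = [(0, 4096), (4096, 4096), (65536, 4096)]
print(coalesce_extents(writes))  # [(0, 8192), (65536, 4096)]
```

In this example the first two writes become a single 8 KiB write, and
the write at offset 65536 stays a request of its own.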

> Regards
> Marc
>

-- 
Jason
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com