Re: Erasure pool performance expectations

Thanks yet again, Nick, for the help and explanations. I will experiment some more and see if I can get the slow requests further down and increase the overall performance.

On Mon, May 16, 2016 at 12:20 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:


> -----Original Message-----
> From: Peter Kerdisle [mailto:peter.kerdisle@xxxxxxxxx]
> Sent: 16 May 2016 11:04
> To: Nick Fisk <nick@xxxxxxxxxx>
> Cc: ceph-users@xxxxxxxxxxxxxx
> Subject: Re: Erasure pool performance expectations
>
>
> On Mon, May 16, 2016 at 11:58 AM, Nick Fisk <nick@xxxxxxxxxx> wrote:
> > -----Original Message-----
> > From: Peter Kerdisle [mailto:peter.kerdisle@xxxxxxxxx]
> > Sent: 16 May 2016 10:39
> > To: nick@xxxxxxxxxx
> > Cc: ceph-users@xxxxxxxxxxxxxx
> > Subject: Re: Erasure pool performance expectations
> >
> > I'm forcing a flush by lowering cache_target_dirty_ratio. This forces
> > writes to the EC pool, and these are the operations I'm trying to
> > throttle a bit. Am I understanding you correctly that the throttling
> > only works the other way around, i.e. promoting cold objects into the
> > hot cache?
>
> Yes, that's correct. You want to throttle the flushes, which is done by a
> couple of other settings.
>
> Firstly, set something like this in your ceph.conf:
> osd_agent_max_low_ops = 1
> osd_agent_max_ops = 4
> I did not know about these. That's great, I will play around with them.
>
>
> These control how many parallel threads the tiering agent will use. You
> can bump them up later if needed.
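>
> They can also be injected at runtime without a restart. As a rough
> sketch, applying them to every OSD at once:
>
> ceph tell osd.* injectargs '--osd_agent_max_low_ops 1 --osd_agent_max_ops 4'
>
> Keeping them in ceph.conf as well means they persist across restarts.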
>
> Next, set these two settings on your cache pools. Try to keep them about
> 0.2 apart, so something like 0.4 and 0.6 is good to start with.
> cache_target_dirty_ratio
> cache_target_dirty_high_ratio
> Here is actually the heart of the matter. Ideally I would love to run it
> at 0.0, if that makes sense: I want no dirty objects in my hot cache at
> all. Has anybody ever tried this? Right now I'm just pushing
> cache_target_dirty_ratio down to 0.2 during low-activity moments and then
> bringing it back up to 0.6 when flushing is done or activity starts up
> again.

You might want to rethink that slightly. Keep in mind that with EC pools, currently any write forces a promotion and then dirties the object. If you then almost immediately flush the objects back down, you are going to end up with a lot of write amplification. You want to keep dirty objects in the cache pool so that you don't incur this penalty if they are written to again in the near future.

I'm guessing what you want is a buffer, so that you can absorb bursts of activity without incurring the performance penalty of flushing? That's what the high and low flush thresholds should give you. By setting them to 0.4 and 0.6, you will have a buffer of 0.2 x the cache tier's capacity in which only slow flushing will occur.
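
As a concrete sketch (the pool name here is just an example):

ceph osd pool set hot-cache cache_target_dirty_ratio 0.4
ceph osd pool set hot-cache cache_target_dirty_high_ratio 0.6

Below 0.4 dirty, the agent flushes nothing; between 0.4 and 0.6 it flushes slowly (bounded by osd_agent_max_low_ops); above 0.6 it flushes at full speed (bounded by osd_agent_max_ops).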

>
>
> And let me know if that helps.
>
>
>
> >
> > The measurement is a problem for me at the moment. I'm trying to get
> > the perf dumps into collectd/graphite, but it seems I need to hand-roll
> > a solution since the plugins I found are not working anymore. What I'm
> > doing now is just summing the bandwidth statistics from my nodes to get
> > an approximate number. I hope to make some time this week to write a
> > collectd plugin to fetch the actual stats from the perf dumps.
>
> I've used diamond to collect the stats and it worked really well. I can
> share my graphite queries to sum the promote/flush rates as well if it
> helps.
> I will check out diamond, are you using this specifically?
> https://github.com/BrightcoveOS/Diamond/wiki/collectors-CephCollector
>
> It would be great if you could share your graphite queries :)

You will need to modify them to suit your environment, but in the examples below OSDs 4 and 7 are the SSDs in the cache tier. You can probably change the "Ceph-Test" server to "*" to catch all servers.

alias(scaleToSeconds(nonNegativeDerivative(sumSeries(servers.Ceph-Test.CephCollector.ceph.osd.[4|7].osd.op_r)),1),"SSD R IOP/s")
alias(scaleToSeconds(nonNegativeDerivative(sumSeries(servers.Ceph-Test.CephCollector.ceph.osd.[4|7].osd.tier_proxy_read)),1),"Proxy Reads/s")
alias(scaleToSeconds(nonNegativeDerivative(sumSeries(servers.Ceph-Test.CephCollector.ceph.osd.[4|7].osd.tier_try_flush)),1),"Flushes/s")
alias(scaleToSeconds(nonNegativeDerivative(sumSeries(servers.Ceph-Test.CephCollector.ceph.osd.[4|7].osd.tier_promote)),1),"Promotions/s")
alias(scaleToSeconds(nonNegativeDerivative(sumSeries(servers.Ceph-Test.CephCollector.ceph.osd.[4|7].osd.tier_evict)),1),"Evictions/s")
alias(scaleToSeconds(nonNegativeDerivative(sumSeries(servers.Ceph-Test.CephCollector.ceph.osd.[4|7].osd.tier_proxy_write)),1),"Proxy Writes/s")
alias(scale(servers.Ceph-Test.CephPoolsCollector.ceph.pools.cache2.MB_used,0.001024),"GB Cache Used")
alias(scale(servers.Ceph-Test.CephPoolsCollector.ceph.pools.cache2.target_max_MB,0.001024),"Target Max GB")
alias(scale(servers.Ceph-Test.CephPoolsCollector.ceph.pools.cache2.dirty_objects,0.004),"Dirty GB")
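
(A note on the last one: the 0.004 scale presumably assumes roughly 4 MB per object, i.e. dirty_objects x 4 MB ~= GB.)

If you do end up hand-rolling a collector, the counters these queries graph all come from the OSD admin socket on each node. A minimal sketch, run on the host carrying osd.4, with the counter names matching the queries above:

ceph daemon osd.4 perf dump | python -c 'import json,sys; osd = json.load(sys.stdin)["osd"]; print(osd["tier_promote"], osd["tier_try_flush"], osd["tier_evict"])'

These are cumulative counters, so a collector needs to sample them periodically and take the rate; that is what the nonNegativeDerivative/scaleToSeconds wrapping above does on the Graphite side.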

>
> >
> > I confirmed the settings are indeed correctly picked up across the nodes in
> > the cluster.
>
> Good, glad we got that sorted.
>
> >
> > I tried switching my pool to readforward, since for my needs the EC
> > pool is fast enough for reads, but I got scared when I saw the warning
> > about data corruption. How safe is readforward really at this point? I
> > noticed the option was removed from the latest docs while still living
> > on in the Google-cached version:
> > http://webcache.googleusercontent.com/search?q=cache:http://docs.ceph.com/docs/master/rados/operations/cache-tiering/
>
> Not too sure about the safety, but I'm of the view that those extra modes
> probably aren't needed; I'm pretty sure the same effect can be achieved
> via the recency settings (someone correct me, please). The higher the
> recency settings, the less likely an object is to be promoted into the
> cache tier. If you set the min read recency to be higher than the
> hit_set count, then in theory no reads will ever cause an object to be
> promoted.
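>
> As a sketch (pool name and values are just examples):
>
> ceph osd pool set hot-cache hit_set_count 4
> ceph osd pool set hot-cache min_read_recency_for_promote 5
>
> With min_read_recency_for_promote greater than hit_set_count, an object
> can never appear in enough recent hit sets to qualify for promotion on
> read, so reads are only ever proxied.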
> You are right; your earlier help made me do exactly that, and things have
> been working better since.
>
> Thanks!


