Re: Erasure pool performance expectations

Nick Fisk <nick@xxxxxxxxxx> · Mon, 16 May 2016 10:14:12 +0100

> -----Original Message-----
> From: Peter Kerdisle [mailto:peter.kerdisle@xxxxxxxxx]
> Sent: 15 May 2016 08:04
> To: Nick Fisk <nick@xxxxxxxxxx>
> Cc: ceph-users@xxxxxxxxxxxxxx
> Subject: Re:  Erasure pool performance expectations
> 
> Hey Nick,
> 
> I've been playing around with the osd_tier_promote_max_bytes_sec setting
> but I'm not really seeing any changes.
> 
> What would be expected when setting a max bytes value? I would expect
> that my OSDs would throttle themselves to this rate when doing promotes
> but this doesn't seem to be the case. When I set it to 2MB I would expect a
> node with 10 OSDs to do a max of 20MB/s during promotions. Is this math
> correct?

Yes, that sounds about right, but the throttle only applies to optional promotions (i.e. reads that meet the recency/hitset settings). Any writes will force the object to be promoted, as you can't write directly to an EC pool. Also, don't forget that once the cache pool is full, it will start evicting/flushing cold objects for every new object that gets promoted into it.
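
As a rough sanity check on that arithmetic, here is a minimal Python sketch. It assumes the throttle applies per OSD and that "2MB" means 2 MiB (2 * 1024 * 1024 bytes); the function name is made up for illustration. The result is only a ceiling for optional promotions, since write-forced promotions are not covered by it.

```python
MIB = 1024 * 1024  # assumption: "2MB" is read as 2 MiB


def node_promote_ceiling(per_osd_bytes_sec: int, osds_per_node: int) -> int:
    """Upper bound (bytes/sec) on throttled promotion traffic for one node.

    Hypothetical helper: simply multiplies the per-OSD throttle by the
    number of OSDs on the node.
    """
    return per_osd_bytes_sec * osds_per_node


ceiling = node_promote_ceiling(2 * MIB, 10)
print(f"{ceiling} bytes/s = {ceiling // MIB} MiB/s")
```

With 10 OSDs at 2 MiB/s each this gives 20971520 bytes/s, i.e. 20 MiB/s per node.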

A few questions:

1. What promotion rates are you seeing?

2. Just out of interest, how are you measuring the promotion rate?

3. Can you confirm that the OSD is picking up that setting correctly by running something like (sudo ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep promote)?
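
If you would rather check this from a script than with grep, a minimal Python sketch follows. It assumes the "config show" output is a flat JSON object of setting names to string values; the helper name and the sample values are made up for illustration and are not real defaults.

```python
import json


def promote_settings(config_show_json: str) -> dict:
    """Return only the settings whose name contains 'promote'.

    Hypothetical helper: expects the text printed by 'config show',
    assumed here to be a flat JSON object.
    """
    config = json.loads(config_show_json)
    return {name: value for name, value in config.items() if "promote" in name}


# Illustrative sample only; the values below are invented.
sample = json.dumps({
    "osd_tier_promote_max_bytes_sec": "2097152",
    "osd_tier_promote_max_objects_sec": "25",
    "osd_max_backfills": "1",
})

for name, value in sorted(promote_settings(sample).items()):
    print(f"{name} = {value}")
```

In practice you would pass in the captured output of the admin socket command instead of the sample string.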

> 
> Thanks,
> 
> Peter
> 
> On Tue, May 10, 2016 at 3:48 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:
> 
> 
> > -----Original Message-----
> > From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
> > Peter Kerdisle
> > Sent: 10 May 2016 14:37
> > Cc: ceph-users@xxxxxxxxxxxxxx
> > Subject: Re:  Erasure pool performance expectations
> >
> > To answer my own question, it seems that you can change settings on the fly
> > using
> >
> > ceph tell osd.* injectargs '--osd_tier_promote_max_bytes_sec 5242880'
> > osd.0: osd_tier_promote_max_bytes_sec = '5242880' (unchangeable)
> >
> > However, the response seems to imply I can't change this setting. Is there
> > another way to change these settings?
> 
> Sorry Peter, I missed your last email. You can also specify that setting in
> ceph.conf; for example, I have this in mine:
> 
> osd_tier_promote_max_bytes_sec = 4000000
> 
> 
> 
> >
> >
> > On Sun, May 8, 2016 at 2:37 PM, Peter Kerdisle <peter.kerdisle@xxxxxxxxx>
> > wrote:
> > Hey guys,
> >
> > I noticed the merge request that fixes the switch around here
> > https://github.com/ceph/ceph/pull/8912
> >
> > I had two questions:
> >
> > • Does this affect my performance in any way? Could it explain the slow
> > requests I keep having?
> > • Can I modify these settings manually myself on my cluster?
> >
> > Thanks,
> >
> > Peter
> >
> >
> > On Fri, May 6, 2016 at 9:58 AM, Peter Kerdisle <peter.kerdisle@xxxxxxxxx>
> > wrote:
> > Hey Mark,
> >
> > Sorry I missed your message as I'm only subscribed to daily digests.
> >
> > Date: Tue, 3 May 2016 09:05:02 -0500
> > From: Mark Nelson <mnelson@xxxxxxxxxx>
> > To: ceph-users@xxxxxxxxxxxxxx
> > Subject: Re:  Erasure pool performance expectations
> > Message-ID: <df3de049-a7f9-7f86-3ed3-47079e4012b9@xxxxxxxxxx>
> > Content-Type: text/plain; charset=windows-1252; format=flowed
> >
> > > In addition to what Nick said, it's really valuable to watch your cache
> > > tier write behavior during heavy IO.  One thing I noticed is you said
> > > you have 2 SSDs for journals and 7 SSDs for data.
> >
> > I thought the hardware recommendations were 1 journal disk per 3 or 4 data
> > disks but I think I might have misunderstood it. Looking at my journal
> > read/writes they seem to be ok though:
> > https://www.dropbox.com/s/er7bei4idd56g4d/Screenshot%202016-05-06%2009.55.30.png?dl=0
> >
> > However I started running into a lot of slow requests (made a separate
> > thread for those: Diagnosing slow requests) and now I'm hoping these could
> > be related to my journaling setup.
> >
> > > If they are all of
> > > the same type, you're likely bottlenecked by the journal SSDs for
> > > writes, which compounded with the heavy promotions is going to really
> > > hold you back.
> > > What you really want:
> > > 1) (assuming filestore) equal large write throughput between the
> > > journals and data disks.
> >
> > How would one achieve that?
> >
> > > 2) promotions to be limited by some reasonable fraction of the cache
> > > tier and/or network throughput (say 70%).  This is why the
> > > user-configurable promotion throttles were added in jewel.
> >
> > Are these already in the docs somewhere?
> >
> > > 3) The cache tier to fill up quickly when empty but change slowly once
> > > it's full (ie limiting promotions and evictions).  No real way to do
> > > this yet.
> > > Mark
> >
> > Thanks for your thoughts.
> >
> > Peter
> >
> >
> 

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com