Re: Erasure pool performance expectations

Hey Nick,

Thanks for taking the time to answer my questions. Some in-line comments.

On Tue, May 3, 2016 at 10:51 AM, Nick Fisk <nick@xxxxxxxxxx> wrote:
Hi Peter,


> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
> Peter Kerdisle
> Sent: 02 May 2016 08:17
> To: ceph-users@xxxxxxxxxxxxxx
> Subject: Erasure pool performance expectations
>
> Hi guys,
>
> I am currently testing the performance of RBD using a cache pool and a 4/2
> erasure profile pool.
>
> I have two SSD cache servers (2 SSDs for journals, 7 SSDs for data) with
> 2x10Gbit bonded each and a six OSD nodes with a 10Gbit public and 10Gbit
> cluster network for the erasure pool (10x3TB without separate journal). This
> is all on Jewel.
>
> What I would like to know is if the performance I'm seeing is to be expected
> and if there is some way to test this in a more qualifiable way.
>
> Everything works as expected if the files are present on the cache pool,
> however when things need to be retrieved from the cache pool I see
> performance degradation. I'm trying to simulate real usage as much as
> possible and trying to retrieve files from the RBD volume over FTP from a
> client server. What I'm seeing is that the FTP transfer will stall for seconds at a
> time and then get some more data which results in an average speed of
> 200KB/s. From the cache this is closer to 10MB/s. Is this the expected
> behaviour from a erasure coded tier with cache in front?

Unfortunately yes. The whole erasure-coded/cache-tier combination only really works well if the data in the EC tier is accessed infrequently; otherwise the overhead of cache promotion/flushing quickly brings the cluster to its knees. However, it looks as though you are mainly doing reads, which means you can probably alter your cache settings so that reads are not promoted so aggressively, since reads can be proxied through to the EC tier instead of triggering a promotion. This should reduce the number of cache promotions required.
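For example, something along these lines should make most reads get proxied rather than promoted (the pool name "cachepool" is just a placeholder for your actual cache-tier pool, and the values are only a starting point to experiment with):

    # Only promote on read if the object appears in the most recent 2 hit sets
    ceph osd pool set cachepool min_read_recency_for_promote 2

    # Keep more, shorter hit sets so the recency check has history to work with
    ceph osd pool set cachepool hit_set_count 4
    ceph osd pool set cachepool hit_set_period 600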

You are correct that reads are a lower priority for caching; ideally only data that is read very frequently should be promoted.
 

Can you try setting min_read_recency_for_promote to something higher?

I looked into that setting before, but I must admit its exact purpose still eludes me. Would it be correct to simplify it as 'min_read_recency_for_promote determines in how many recent hit_set_period intervals an object must have been read before it is promoted to the cache tier'?


Also can you check what your hit_set_period and hit_set_count is currently set to.

hit_set_count is set to 1 and hit_set_period to 1800. 
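For reference, this is roughly how I checked the current values ("cachepool" here is just a placeholder for my actual cache pool name):

    ceph osd pool get cachepool hit_set_count
    ceph osd pool get cachepool hit_set_period
    ceph osd pool get cachepool min_read_recency_for_promote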

What would increasing the hit_set_count do exactly?



> Right now I'm unsure how to scientifically test the performance retrieving
> files when there is a cache miss. If somebody could point me towards a
> better way of doing that I would appreciate the help.
>
> An other thing is that I'm seeing a lot of messages popping up in dmesg on
> my client server on which the RBD volumes are mounted. (IPs removed)
>
> [685881.477383] libceph: osd50 :6800 socket closed (con state OPEN)
> [685895.597733] libceph: osd54 :6808 socket closed (con state OPEN)
> [685895.663971] libceph: osd54 :6808 socket closed (con state OPEN)
> [685895.710424] libceph: osd54 :6808 socket closed (con state OPEN)
> [685895.749417] libceph: osd54 :6808 socket closed (con state OPEN)
> [685896.517778] libceph: osd54 :6808 socket closed (con state OPEN)
> [685906.690445] libceph: osd74 :6824 socket closed (con state OPEN)
>
> Is this a symptom of something?

These are just stale connections to the OSDs timing out after the idle period; it's nothing to worry about.
 
Glad to hear that, I was fearing something might be wrong.

Thanks again.

Peter
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
