Re: Very slow cache flush


 



Just a note: if you are going to write a LOT of data into the EC pool, enough that it takes over an hour to finish, you will want to change these settings until you are done.  If you write data non-stop for 3 hours, then from about the 30-minute mark you will still be doing your active writes, but the cache tier will also be forcing a second set of writes and deletes onto the disks until you finish.  This only slows down the cluster and makes your writes take longer.  If that isn't a concern, keep the settings as they are for simplicity and let it finish eventually.  On the other hand, if performance becomes too slow after writing non-stop for a while, you'll want to increase the cache_min_flush_age and cache_min_evict_age settings until you are done.

If you regularly dump large amounts of data into the EC pool, then you might just want to set these values a little higher to help with the occasional ingestion.
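
The arithmetic above can be sketched roughly as follows. This is only an illustrative model: it assumes the tier starts flushing as soon as the oldest object reaches cache_min_flush_age, which is what happens when the cache targets are 0. The function names, the pool name `cache-pool`, and the 4 hour / 5 hour values are made up for the example. The `ceph osd pool set` lines are only built and printed, not executed.

```python
from __future__ import annotations


def flush_overlap_seconds(ingest_seconds: int, cache_min_flush_age: int) -> int:
    """Seconds of a non-stop ingest during which flushing competes with the writes.

    Simplified model: flushing begins once the first object written is
    cache_min_flush_age seconds old and continues until the ingest ends.
    """
    return max(0, ingest_seconds - cache_min_flush_age)


def pool_set_commands(pool: str, flush_age: int, evict_age: int) -> list[str]:
    """Build (but do not run) the commands that would apply the two ages."""
    return [
        f"ceph osd pool set {pool} cache_min_flush_age {flush_age}",
        f"ceph osd pool set {pool} cache_min_evict_age {evict_age}",
    ]


if __name__ == "__main__":
    three_hours = 3 * 3600

    # With a 30 minute flush age, 9000 s (2.5 of the 3 hours) overlap with flushing.
    print(flush_overlap_seconds(three_hours, 1800))

    # With a flush age longer than the ingest, nothing overlaps.
    print(flush_overlap_seconds(three_hours, 4 * 3600))

    # Hypothetical temporary values for the duration of the ingest.
    for command in pool_set_commands("cache-pool", 4 * 3600, 5 * 3600):
        print(command)
```

Once the ingest is done, the same two commands with the original values (1800 and 3600 in my case) put things back.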

On Wed, May 17, 2017 at 10:59 AM Ashley Merrick <ashley@xxxxxxxxxxxxxx> wrote:

Hello,

 

I have the same use case as you.

 

After applying the same settings as yours, it seems to be running correctly and clearing itself.

 

Will monitor and let you know.

 

Thanks for that.

 

,Ashley

 

From: David Turner [mailto:drakonstein@xxxxxxxxx]
Sent: 17 May 2017 22:52
To: Ashley Merrick <ashley@xxxxxxxxxxxxxx>; ceph-users@xxxxxxxx
Subject: Re: Very slow cache flush

 

It doesn't sound like you're actually dealing with slow speeds when you execute the flush/evict commands, just that it isn't keeping your cache as small as you'd expect.  What are your cache settings?  This is all math and equations.  One of your settings/variables must be off, skewing your result.

 

I use a cache pool in front of an EC pool because that is still required in Jewel, but I don't need it to speed up the underlying pool.  I have my settings so that my cache pool remains empty pretty much constantly.  A friend has a similar setup, except that his files are very likely to be read, modified, or rewritten during the 24 hours after they are first written or accessed.

 

Here are my settings for my cache pool, as shown by `ceph osd pool get {pool_name} all`.  Notice my targets are 0 for full_ratio, max_objects, etc.  Things stay in the cache because the cache_min_flush_age is 30 minutes and the cache_min_evict_age is an hour.  My friend's settings vary only in that his min_flush_age and min_evict_age are 24 hours and 30 hours respectively.  The default action as soon as an object reaches the minimum age is to be flushed and evicted, because the target for the cache is to be completely empty.  I wrote over 200GB of data to this EC pool yesterday, and today there is less than 500MB in its cache, because of some minor reads that have been happening this morning.  The cache pool is regularly 0% full without any intervention.

  hit_set_type: bloom
  hit_set_period: 3600
  hit_set_count: 1
  target_max_objects: 0
  target_max_bytes: 2000000000000
  cache_target_dirty_ratio: 0
  cache_target_dirty_high_ratio: 0.6
  cache_target_full_ratio: 0
  cache_min_flush_age: 1800
  cache_min_evict_age: 3600
  min_read_recency_for_promote: 0
  min_write_recency_for_promote: 1

 

On the other hand, if you are using cache tiering for actual performance tuning instead of just to satisfy the demands of an EC pool, then these settings probably don't make much sense for your use case.
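
As a rough way to read a settings dump like the one above, here is a small sketch that parses the `key: value` lines and summarizes what they imply. It assumes the output has exactly that one-pair-per-line shape; `parse_pool_settings` and `describe` are hypothetical helper names, not part of any Ceph tooling, and the summary wording is mine.

```python
from __future__ import annotations

# The settings quoted in this thread, one "key: value" pair per line.
SETTINGS_TEXT = """\
hit_set_type: bloom
hit_set_period: 3600
hit_set_count: 1
target_max_objects: 0
target_max_bytes: 2000000000000
cache_target_dirty_ratio: 0
cache_target_dirty_high_ratio: 0.6
cache_target_full_ratio: 0
cache_min_flush_age: 1800
cache_min_evict_age: 3600
min_read_recency_for_promote: 0
min_write_recency_for_promote: 1
"""


def parse_pool_settings(text: str) -> dict[str, str]:
    """Turn "key: value" lines into a dict, skipping lines without a colon."""
    settings: dict[str, str] = {}
    for line in text.splitlines():
        key, separator, value = line.partition(":")
        if separator:
            settings[key.strip()] = value.strip()
    return settings


def describe(settings: dict[str, str]) -> list[str]:
    """Summarize the ratios and minimum ages in plain words."""
    notes = []
    dirty = float(settings["cache_target_dirty_ratio"])
    full = float(settings["cache_target_full_ratio"])
    if dirty == 0 and full == 0:
        notes.append("dirty and full targets are 0: the cache aims to be empty")
    else:
        notes.append(f"dirty target {dirty}, full target {full}")
    flush_minutes = int(settings["cache_min_flush_age"]) // 60
    evict_minutes = int(settings["cache_min_evict_age"]) // 60
    notes.append(f"objects are held at least {flush_minutes} min before flush")
    notes.append(f"objects are held at least {evict_minutes} min before evict")
    return notes


if __name__ == "__main__":
    for note in describe(parse_pool_settings(SETTINGS_TEXT)):
        print(note)
```

Running the same check against your own pool's output should show quickly which variable differs from what you expect.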

 

On Wed, May 17, 2017 at 6:57 AM Ashley Merrick <ashley@xxxxxxxxxxxxxx> wrote:

Hello,

I recently doubled the PGs on my cache pool, from 256 to 512.

I have made sure all the PGs in question have been deep scrubbed; however, my cache will no longer flush or evict at any decent speed.

If left on its own it does a couple of objects here and there, but it eventually grows and gets dangerously near its limit.

Currently I am having to run the cache flush / evict command in a loop, which is very slowly bringing it down: a burst of objects and then back to a few a second if lucky.

Both cache pool and underlying pool have plenty of capacity and performance is fine.

The only thing that has changed is the number of PGs, and it's as if somewhere in the flush calculation it hasn't been updated with the new number.

I have a max bytes limit and fairly low flush and evict ratios, which normally leave plenty of % free.

,Ashley

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

