Re: Set cache tier pool forward state automatically!


 



Does the cache pool still flush if the dirty ratio is set to a minimal value while the pool doesn't meet its min_size? In other words, when an OSD fails in a pool of size 2, does Ceph block only writes, or does it block reads too?

Because on paper this looks good for a small cache pool: in case of an OSD failure, set the lowest ratio to force a flush, wait for it to finish, and then put the tier in forward mode, or disable it completely until it's fixed.
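Roughly the sequence I have in mind, as an untested sketch (the pool name is just a placeholder, and newer releases may also want an extra confirmation flag when switching the cache mode):

    import subprocess

    CACHE_POOL = "hot-cache"   # placeholder name for the cache tier pool

    def run(*cmd):
        # thin wrapper that raises if a ceph/rados command fails
        subprocess.run(list(cmd), check=True)

    # 1. drop the dirty ratio very low so the tiering agent starts flushing
    run("ceph", "osd", "pool", "set", CACHE_POOL, "cache_target_dirty_ratio", "0.01")

    # 2. explicitly flush and evict whatever is still in the cache pool
    run("rados", "-p", CACHE_POOL, "cache-flush-evict-all")

    # 3. put the tier in forward mode so clients go straight to the base pool
    run("ceph", "osd", "tier", "cache-mode", CACHE_POOL, "forward")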

On 4 Feb 2016 01:57, "Robert LeBlanc" <robert@xxxxxxxxxxxxx> wrote:

My experience with Hammer is that setting the pool to forward mode
does not evict objects, nor do I think it flushes them. We have had
our pool in forward mode for weeks now and we still see almost the
same amount of I/O to it. There has been a slight shift between SSD
and HDD, but I think that is because some objects have cooled off and
others have been newly accessed. You may have better luck adjusting
the ratios, but we see a big hit to our cluster when we do that. We
usually drop the ratio 1% every minute or two to reduce the impact of
evicting the data. We typically drop the cache full ratio 10% or so
to evict some objects, and we then toggle the cache mode between
writeback and forward periodically to warm up the cache. Setting it
to writeback promotes so many objects at once that it severely
impacts our cluster. There is also a limit we hit at about 10K IOPS
in writeback, whereas in forward mode I've seen spikes to 64K IOPS.
So we turn on writeback for 30-60 seconds (or until the blocked I/O
is too much for us to handle), then set it to forward for 60-120
seconds, and rinse and repeat until the impact of writeback isn't so
bad, then set it back to forward for a couple more weeks.
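
As a rough illustration of that warm-up cycle (not our actual tooling; the
pool name, cycle count and sleep times are placeholders):

    import subprocess
    import time

    CACHE_POOL = "hot-cache"   # placeholder pool name
    CYCLES = 10                # arbitrary; in practice we stop when the impact eases

    def set_cache_mode(mode):
        # toggle the tier between writeback and forward
        subprocess.run(["ceph", "osd", "tier", "cache-mode", CACHE_POOL, mode],
                       check=True)

    for _ in range(CYCLES):
        set_cache_mode("writeback")   # short burst to promote hot objects
        time.sleep(45)                # 30-60 seconds, or until blocked I/O hurts
        set_cache_mode("forward")     # back off so the cluster can recover
        time.sleep(90)                # 60-120 seconds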

Needless to say, cache tiering could use some more love. If I get some
time, I'd like to try to help with that section of code, but I have a
couple of other, more pressing issues I'd like to track down first.
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Wed, Feb 3, 2016 at 10:01 AM, Nick Fisk <nick@xxxxxxxxxx> wrote:
> I think this would be better done outside of Ceph. It should be quite simple for whatever monitoring software you are using to pick up the disk failure and set cache_target_dirty_ratio to a very low value or change the actual caching mode.
>
> Doing it in Ceph would be complicated, as you are then asking Ceph to decide when you are in an at-risk scenario, i.e. would you want it to flush your cache after a quick service reload or node reboot?
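>
> For example, a rough sketch of such an external hook (the pool name and OSD
> ids are placeholders, and the polling loop is only illustrative; real
> monitoring software would trigger this from its own alerting instead):
>
>     import json
>     import subprocess
>     import time
>
>     CACHE_POOL = "hot-cache"        # placeholder cache tier pool name
>     CACHE_POOL_OSDS = {10, 11, 12}  # placeholder: OSD ids backing the cache pool
>
>     def down_osds():
>         # parse `ceph osd dump` and return the ids of OSDs that are not up
>         out = subprocess.run(["ceph", "osd", "dump", "--format", "json"],
>                              check=True, capture_output=True, text=True).stdout
>         return {o["osd"] for o in json.loads(out)["osds"] if o["up"] == 0}
>
>     while True:
>         if down_osds() & CACHE_POOL_OSDS:
>             # an OSD backing the cache pool went down: drop the dirty ratio so
>             # the tiering agent flushes aggressively (or switch the cache mode)
>             subprocess.run(["ceph", "osd", "pool", "set", CACHE_POOL,
>                             "cache_target_dirty_ratio", "0.01"], check=True)
>             break
>         time.sleep(30)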
>
>> -----Original Message-----
>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
>> Mihai Gheorghe
>> Sent: 03 February 2016 16:57
>> To: ceph-users <ceph-users@xxxxxxxxxxxxxx>; ceph-users <ceph-
>> users@xxxxxxxx>
>> Subject: Set cache tier pool forward state automatically!
>>
>> Hi,
>>
>> Is there a built-in setting in Ceph that would set the cache pool from
>> writeback to forward state automatically in case of an OSD failure in the pool?
>>
>> Let's say the size of the cache pool is 2. If an OSD fails, Ceph blocks writes to
>> the pool, making the VMs that use this pool inaccessible. But an earlier copy of
>> the data, as of the last cache flush, is present on the cold storage pool.
>>
>> In this case, is it possible that when an OSD fails, the data on the cache pool
>> is flushed onto the cold storage pool and the forward flag is set automatically
>> on the cache pool? That way the VMs could resume writing to the block device
>> as soon as the cache is flushed, and read/write directly from the cold storage
>> pool until the cache pool is fixed manually and set back to writeback.
>>
>> This way we could get away with a pool size of 2 without worrying about too much
>> downtime!
>>
>> Hope I was clear enough!
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
