Re: Set cache tier pool forward state automatically!

On Wed, 3 Feb 2016 16:57:09 -0700 Robert LeBlanc wrote:

> 
> My experience with Hammer shows that setting the pool to forward
> mode does not evict objects, nor do I think it flushes them.
>
Same here (with Firefly).

> We have had our pool in forward mode for weeks now and we still have
> almost the same amount of I/O to it. There has been a slight shift
> between SSD and HDD, but I think that is because some objects have
> cooled off and others have been newly accessed. You may have better
> luck adjusting the ratios, but we see a big hit to our cluster when
> we do that, so we usually move them 1% every minute or two to reduce
> the impact of evicting the data. We usually drop the cache full
> ratio by 10% or so to evict some objects, and we then toggle the
> cache mode between writeback and forward periodically to warm up the
> cache. Setting it to writeback promotes so many objects at once that
> it severely impacts our cluster. There is also a limit we hit at
> about 10K IOPS in writeback, whereas with forward I've seen spikes
> to 64K IOPS. So we turn on writeback for 30-60 seconds (or until the
> blocked I/O is too much for us to handle), then set it to forward
> for 60-120 seconds, rinse and repeat until the impact of writeback
> isn't so bad, then set it back to forward for a couple more weeks.
>
That's an interesting strategy. I suppose you haven't run into the
issue I wrote about 2 days ago when switching to forward while running
rbd bench?

In my case I venture that the number of really hot objects is small
enough not to overwhelm things, and that 5K IOPS would be all the
cluster ever needs to provide.
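
For anyone wanting to script something similar, my reading of the
toggling Robert describes would be roughly this (an untested sketch;
"hot-pool" is a placeholder for the actual cache pool name and the
values of course need tuning):

  # drop the full ratio a bit (Robert lowers it ~10%, 1% at a time)
  # to evict some of the colder objects
  ceph osd pool set hot-pool cache_target_full_ratio 0.70

  # let writeback promote the genuinely hot objects for a short while
  ceph osd tier cache-mode hot-pool writeback
  sleep 60

  # then stop promotions again before the blocked I/O gets too bad
  ceph osd tier cache-mode hot-pool forward

Rinse and repeat until the writeback phase no longer hurts, as he says.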

Regards,

Christian
 
> Needless to say, cache tiering could use some more love. If I get
> some time, I'd like to try to help with that section of code, but I
> have a couple of other, more pressing issues to track down first.
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
> 
> 
> On Wed, Feb 3, 2016 at 10:01 AM, Nick Fisk <nick@xxxxxxxxxx> wrote:
> > I think this would be better done outside of Ceph. It should be
> > quite simple for whatever monitoring software you are using to
> > pick up the disk failure and set the target_dirty_ratio to a very
> > low value or change the actual caching mode.
> >
> > Doing it in Ceph would be complicated, as you are then asking Ceph
> > to decide when you are in an at-risk scenario, i.e. would you want
> > it to flush your cache after a quick service reload or node reboot?
> >
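
FWIW, the external reaction Nick describes boils down to a couple of
one-liners. A minimal sketch of such a hook (how the monitoring system
detects the failed OSD is left out; pool name and threshold are
placeholders):

  #!/bin/sh
  # run by the monitoring system when an OSD in the cache pool dies
  CACHE_POOL=hot-pool
  # flush dirty objects down to the backing pool as soon as possible
  ceph osd pool set $CACHE_POOL cache_target_dirty_ratio 0.01
  # and stop sending new writes into the degraded cache tier
  ceph osd tier cache-mode $CACHE_POOL forward
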
> >> -----Original Message-----
> >> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf
> >> Of Mihai Gheorghe
> >> Sent: 03 February 2016 16:57
> >> To: ceph-users <ceph-users@xxxxxxxxxxxxxx>; ceph-users <ceph-
> >> users@xxxxxxxx>
> >> Subject:  Set cache tier pool forward state automatically!
> >>
> >> Hi,
> >>
> >> Is there a built-in setting in Ceph that would switch the cache
> >> pool from writeback to forward state automatically in case an OSD
> >> in the pool fails?
> >>
> >> Let's say the size of the cache pool is 2. If an OSD fails, Ceph
> >> blocks writes to the pool, making the VMs that use this pool
> >> inaccessible. But an earlier copy of the data, as of the last
> >> cache flush, is present on the cold storage pool.
> >>
> >> In this case, is it possible that when an OSD fails, the data in
> >> the cache pool is flushed to the cold storage pool and the forward
> >> flag is set automatically on the cache pool? That way the VMs
> >> could resume writing to their block devices as soon as the cache
> >> is flushed, reading and writing directly from the cold storage
> >> pool until manual intervention fixes the cache pool and sets it
> >> back to writeback.
> >>
> >> This way we could get away with a pool size of 2 without worrying
> >> about too much downtime!
> >>
> >> Hope I was clear enough!
> >
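
And regarding the original question: as far as I can tell nothing in
Ceph does this automatically today, but the manual intervention Mihai
describes amounts to something like (again only a sketch, pool name is
a placeholder):

  # stop new writes from landing in the cache tier ...
  ceph osd tier cache-mode hot-pool forward
  # ... and push everything it still holds down to the backing pool
  rados -p hot-pool cache-flush-evict-all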
> 


-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Rakuten Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


