I think it depends. If there are no writes, then there probably won't be any blocking if there are fewer than min_size OSDs to service a PG. In an RBD workload, that is highly unlikely. If there is no blocking, then setting the full_ratio to near zero should flush out the blocks even with fewer than min_size, but any write might stall it all. You would have to test it. Just be aware that you are trying something I haven't heard of anyone doing, so it may or may not work. Do a lot of testing, think through all the failure scenarios that might occur, and try them.

Honestly, if you are writing to it, I wouldn't trust less than size=3, min_size=2 with automatic recovery. If you only have two copies and there is corruption, you can't easily tell which one is the right one; with three you at least have a vote. Ceph is also supposed to get smarter about recovery and use that voting to auto-recover the "best" candidate, and I hope that with hashing it will be a slam dunk for automatic recovery.
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1


On Wed, Feb 3, 2016 at 5:07 PM, Mihai Gheorghe <mcapsali@xxxxxxxxx> wrote:
> Does the cache pool flush when a minimum ratio is set if the pool doesn't meet min_size? I mean, does Ceph block only writes when an OSD fails in a pool of size 2, or does it block reads too?
>
> Because on paper it looks good for a small cache pool: in case of an OSD failure, set the lowest ratio to flush, wait for it to finish, and then set the pool to forward mode, or disable it completely until it's fixed.
>
> On 4 Feb 2016 01:57, "Robert LeBlanc" <robert@xxxxxxxxxxxxx> wrote:
>>
>> My experience with Hammer is showing that setting the pool to forward mode is not evicting objects, nor do I think it is flushing objects. We have had our pool in forward mode for weeks now and we still see almost the same amount of I/O to it. There has been a slight shift between SSD and HDD, but I think that is because some objects have cooled off and others have been newly accessed. You may have better luck adjusting the ratios, but we see a big hit to our cluster when we do that. We usually go 1% every minute or two to help reduce the impact of evicting the data. (We usually drop the cache full ratio 10% or so to evict some objects, and we then toggle the cache mode between writeback and forward periodically to warm up the cache.) Setting it to writeback will promote so many objects at once that it severely impacts our cluster.
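
A minimal sketch of what the gradual ratio-stepping and writeback/forward toggling described above could look like when scripted. The pool name "cachepool", the ratio values, and the timings are made-up examples, and the ceph commands used (cache_target_full_ratio, osd tier cache-mode) should be checked against your release before relying on them:

#!/usr/bin/env python
# Sketch only: lower the cache tier's full ratio about 1% per step instead
# of all at once, to limit the flush/evict impact on client I/O, then
# briefly toggle the cache mode to warm the tier back up.
# Assumptions: a cache pool named "cachepool", a drop from 80% to 70%,
# and the ceph CLI available on this host.
import subprocess
import time

CACHE_POOL = "cachepool"   # hypothetical pool name

def set_full_ratio(ratio):
    # ceph osd pool set <pool> cache_target_full_ratio <0.0-1.0>
    subprocess.check_call([
        "ceph", "osd", "pool", "set", CACHE_POOL,
        "cache_target_full_ratio", "%.2f" % ratio,
    ])

def set_cache_mode(mode):
    # ceph osd tier cache-mode <pool> <writeback|forward>
    # (newer releases may require --yes-i-really-mean-it for forward)
    subprocess.check_call(["ceph", "osd", "tier", "cache-mode", CACHE_POOL, mode])

if __name__ == "__main__":
    # Step the full ratio down from 79% to 70%, 1% every minute or two.
    for pct in range(79, 69, -1):
        set_full_ratio(pct / 100.0)
        time.sleep(90)

    # Warm the cache briefly in writeback, then fall back to forward,
    # as in the toggle procedure described in the message above.
    set_cache_mode("writeback")
    time.sleep(45)
    set_cache_mode("forward")
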
>> There is also a limit that we reach at about 10K IOPS when in writeback, whereas with forward I've seen spikes to 64K IOPS. So we turn on writeback for 30-60 seconds (or until the blocked I/O is too much for us to handle), then set it to forward for 60-120 seconds, rinse and repeat until the impact of writeback isn't so bad, then set it back to forward for a couple more weeks.
>>
>> Needless to say, cache tiering could use some more love. If I get some time, I'd like to try to help with that section of code, but I have a couple of other more pressing issues I'd like to track down first.
>> ----------------
>> Robert LeBlanc
>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>>
>>
>> On Wed, Feb 3, 2016 at 10:01 AM, Nick Fisk <nick@xxxxxxxxxx> wrote:
>> > I think this would be better done outside of Ceph. It should be quite simple for whatever monitoring software you are using to pick up the disk failure and then set the target_dirty_ratio to a very low value or change the actual caching mode.
>> >
>> > Doing it in Ceph would be complicated, as you are then asking Ceph to decide when you are in an at-risk scenario, i.e. would you want it to flush your cache after a quick service reload or a node reboot?
>> >
>> >> -----Original Message-----
>> >> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Mihai Gheorghe
>> >> Sent: 03 February 2016 16:57
>> >> To: ceph-users <ceph-users@xxxxxxxxxxxxxx>; ceph-users <ceph-users@xxxxxxxx>
>> >> Subject: Set cache tier pool forward state automatically!
>> >>
>> >> Hi,
>> >>
>> >> Is there a built-in setting in Ceph that would switch the cache pool from writeback to forward state automatically in case of an OSD failure in that pool?
>> >>
>> >> Let's say the size of the cache pool is 2. If an OSD fails, Ceph blocks writes to the pool, making the VMs that use this pool inaccessible. But an earlier copy of the data, from before the last cache flush, is present on the cold storage pool.
>> >>
>> >> In this case, is it possible, when an OSD fails, for the data on the cache pool to be flushed onto the cold storage pool and for the forward flag to be set automatically on the cache pool? That way the VM could resume writing to the block device as soon as the cache is flushed, and read/write directly from the cold storage pool until manual intervention fixes the cache pool and sets it back to writeback.
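
Below is a rough sketch of the "do it outside of Ceph" approach Nick describes and the automatic flush-then-forward behaviour Mihai is asking about: an external watcher that reacts to a down cache-tier OSD by lowering the dirty/full ratios and switching the tier to forward mode. The pool name, the OSD id list, and the ratio values are assumptions rather than anything from the thread, and given Robert's notes above about blocked I/O below min_size, it would need the same careful testing he recommends:

#!/usr/bin/env python
# Sketch of an external watcher (outside of Ceph) that notices a down OSD
# among the cache-tier OSDs and reacts by lowering the dirty/full ratios
# and putting the tier into forward mode, leaving any further decisions to
# the operator.
# Assumptions: the OSD ids backing the cache tier are listed in
# CACHE_TIER_OSDS, the cache pool is named "cachepool", and the ceph CLI
# is available; none of this comes from the original thread.
import json
import subprocess
import time

CACHE_POOL = "cachepool"          # hypothetical cache tier pool name
CACHE_TIER_OSDS = {0, 1, 2, 3}    # hypothetical OSD ids backing the tier

def down_cache_osds():
    # "ceph osd tree -f json" lists each OSD with an up/down status.
    out = subprocess.check_output(["ceph", "osd", "tree", "-f", "json"])
    tree = json.loads(out.decode("utf-8"))
    return [n["id"] for n in tree.get("nodes", [])
            if n.get("type") == "osd"
            and n["id"] in CACHE_TIER_OSDS
            and n.get("status") == "down"]

def drain_and_forward():
    # Push dirty data to the base pool aggressively, then stop promotions.
    for key, val in (("cache_target_dirty_ratio", "0.01"),
                     ("cache_target_full_ratio", "0.05")):
        subprocess.check_call(["ceph", "osd", "pool", "set", CACHE_POOL, key, val])
    # Some releases need --yes-i-really-mean-it for forward mode.
    subprocess.check_call(["ceph", "osd", "tier", "cache-mode", CACHE_POOL, "forward"])

if __name__ == "__main__":
    while True:
        down = down_cache_osds()
        if down:
            print("cache tier OSDs down: %s, draining tier" % down)
            drain_and_forward()
            break                 # hand back to the operator after reacting
        time.sleep(30)
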
>> >> This way we can get away with a pool size of 2 without worrying too much about downtime!
>> >>
>> >> Hope I was explicit enough!

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com