My read of that doc is that you still need to either set the configs to
force all objects to be flushed or use the rados command to flush/evict
all objects.
-Sam

On Wed, Nov 18, 2015 at 2:38 AM, Nick Fisk <nick@xxxxxxxxxx> wrote:
> Hi Robert,
>
>> -----Original Message-----
>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
>> Robert LeBlanc
>> Sent: 18 November 2015 00:47
>> To: Ceph-User <ceph-users@xxxxxxxx>
>> Subject: SSD Caching Mode Question
>>
>> We are inserting an SSD tier into our very busy cluster and I have a
>> question regarding writeback and forward modes.
>>
>> Writeback is the "normal" mode for RBD with VMs. When we put the tier in
>> writeback mode we see objects being promoted, and once the ratio is
>> reached objects are evicted; this works as expected. When we place the
>> tier into forward mode, however, we don't see any objects being evicted
>> to the base tier when they are written to, as described in the manual [1].
>> Is this a bug? We are running 0.94.5.
>>
>> Now, I usually like things to work the way they are described in the
>> manual, but this "bug" is a bit advantageous for us. It appears that we
>> don't have enough IOPS in the SSD tier to handle the steady state (we
>> still have some more SSDs to add in, but it requires shuffling hardware
>> around). However, when we put the tier into forward mode, the latency
>> drops and we get much more performance from the Ceph cluster. In
>> writeback we seem to be capped at about 9K IOPS according to ceph -w,
>> with spikes up to about 15K. In forward mode we can hit 65K IOPS and
>> have a steady state near 30K IOPS. I'm linking two graphs to show what
>> I'm describing (for some reason the graphs seem to be half of what is
>> reported by ceph -w). [2][3]
>>
>
> I don't know if your lower performance is due to unwanted promotions to
> the cache or if you are seeing something else. I have found that, the way
> the cache logic currently works, unless the bulk of your working set fits
> in the cache tier, the overhead of the promotions/flushes/evictions can
> cause a significant penalty. This is especially true if you are doing IO
> which is small compared to the object size. I believe this may be caused
> by the read being serviced only after the promotion completes, rather
> than the read being served from the base tier and the object then
> promoted asynchronously.
>
>> Does the promote/evict logic really add that much latency? It seems that
>> overall the tier performance can be very good. We are using three hit
>> sets with 10 minutes per set, and all three sets have to have a read to
>> promote an object (we don't want to promote isolated reads). Does someone
>> have some suggestions for getting the forward-like performance in
>> writeback?
>
> When you say you are using 3 hit sets and require 3 reads to promote, is
> this via the min_read_recency variable? My understanding was that if it
> is set to 3 it will promote if it finds a hit in any of the last 3
> hitsets. The description isn't that clear in the documentation, but
> looking through the code seems to support this. If you have found a way
> to only promote when there is a hit in all 3 hitsets I would be very
> interested in hearing about it, as it would be very useful to me.
>
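For reference, the hit-set and recency settings discussed above are
ordinary per-pool options. A minimal sketch of the configuration Robert
describes (three hit sets of 10 minutes each), assuming bloom-type hit
sets and a cache pool named hot-pool (a placeholder, not from this
thread):

    # track recent access in three bloom-filter hit sets of 600 seconds each
    ceph osd pool set hot-pool hit_set_type bloom
    ceph osd pool set hot-pool hit_set_count 3
    ceph osd pool set hot-pool hit_set_period 600

    # recency check on reads; per Nick's reading of the code, a hit in ANY
    # of the last 3 hit sets (not all 3) is enough to trigger a promotion
    ceph osd pool set hot-pool min_read_recency_for_promote 3

Whether this gives the "hit in all 3 sets" behaviour Robert wants is
exactly the open question in Nick's reply.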
>> We have 35 1 TB Micron M600 drives (26K single-thread direct sync 4K
>> random writes, 43K in a two-thread test; we are already aware of the
>> potential power-loss issue so you don't need to bring that up) in 3x
>> replication. Our current hot set is about 4.5 TB and only shifts by
>> about 30% over a week's time. We have cache_target_full_ratio set to
>> 0.55 so that we leave a good part of each drive empty for performance.
>> Also, about 90% of our reads are in 10% of the working set and 80% of
>> our writes are in about 20% of the working set.
>>
>> [1] http://docs.ceph.com/docs/master/rados/operations/cache-tiering/#removing-a-writeback-cache
>> [2] http://robert.leblancnet.us/files/performance.png
>> [3] http://robert.leblancnet.us/files/promote_evict.png
>>
>> Thanks,
>> ----------------
>> Robert LeBlanc
>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
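For completeness, the flush/evict step Sam refers to at the top follows
the "Removing a Writeback Cache" procedure linked in [1]. A rough sketch,
again assuming a cache pool named hot-pool (a placeholder):

    # stop the cache tier handling new writes; IO is redirected to the base tier
    ceph osd tier cache-mode hot-pool forward

    # flush dirty objects and evict everything still held in the cache pool
    rados -p hot-pool cache-flush-evict-all

    # confirm the cache pool is empty before removing the tier
    rados -p hot-pool ls

The other route Sam mentions is to lower the target ratios on the cache
pool (cache_target_dirty_ratio / cache_target_full_ratio, the latter being
the knob Robert already has at 0.55) so the tiering agent flushes and
evicts the objects on its own.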