> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Nick Fisk
> Sent: 15 February 2017 09:53
> To: 'Christian Balzer' <chibi@xxxxxxx>; 'Ceph Users' <ceph-users@xxxxxxxxxxxxxx>
> Subject: Re: bcache vs flashcache vs cache tiering
>
> > -----Original Message-----
> > From: Christian Balzer [mailto:chibi@xxxxxxx]
> > Sent: 15 February 2017 01:42
> > To: 'Ceph Users' <ceph-users@xxxxxxxxxxxxxx>
> > Cc: Nick Fisk <nick@xxxxxxxxxx>; 'Gregory Farnum' <gfarnum@xxxxxxxxxx>
> > Subject: Re: bcache vs flashcache vs cache tiering
> >
> > On Tue, 14 Feb 2017 22:42:21 -0000 Nick Fisk wrote:
> >
> > > > -----Original Message-----
> > > > From: Gregory Farnum [mailto:gfarnum@xxxxxxxxxx]
> > > > Sent: 14 February 2017 21:05
> > > > To: Wido den Hollander <wido@xxxxxxxx>
> > > > Cc: Dongsheng Yang <dongsheng.yang@xxxxxxxxxxxx>; Nick Fisk <nick@xxxxxxxxxx>; Ceph Users <ceph-users@xxxxxxxxxxxxxx>
> > > > Subject: Re: bcache vs flashcache vs cache tiering
> > > >
> > > > On Tue, Feb 14, 2017 at 8:25 AM, Wido den Hollander <wido@xxxxxxxx> wrote:
> > > > >
> > > > >> On 14 February 2017 at 11:14, Nick Fisk <nick@xxxxxxxxxx> wrote:
> > > > >>
> > > > >> > -----Original Message-----
> > > > >> > From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Dongsheng Yang
> > > > >> > Sent: 14 February 2017 09:01
> > > > >> > To: Sage Weil <notifications@xxxxxxxxxx>
> > > > >> > Cc: ceph-devel@xxxxxxxxxxxxxxx; ceph-users@xxxxxxxxxxxxxx
> > > > >> > Subject: bcache vs flashcache vs cache tiering
> > > > >> >
> > > > >> > Hi Sage and all,
> > > > >> > We are going to use SSDs for cache in ceph. But I am not sure which one is the best solution, bcache? flashcache? or cache tier?
> > > > >>
> > > > >> I would vote for cache tier.
> > > > >> Being able to manage it from within Ceph, instead of having to manage X number of bcache/flashcache instances, appeals to me more. Also last time I looked Flashcache seems unmaintained and bcache might be going that way with talk of this new bcachefs. Another point to consider is that Ceph has had a lot of work done on it to ensure data consistency; I don't ever want to be in a position where I'm trying to diagnose problems that might be being caused by another layer sitting in-between Ceph and the Disk.
> > > > >>
> > > > >> However, I know several people on here are using bcache and potentially getting better performance than with cache tiering, so hopefully someone will give their views.
> > > > >
> > > > > I am using Bcache on various systems and it performs really well. The caching layer in Ceph is slow. Promoting Objects is slow and it also involves additional RADOS lookups.
> > > >
> > > > Yeah. Cache tiers have gotten a lot more usable in Ceph, but the use cases where they're effective are still pretty limited and I think in-node caching has a brighter future. We just don't like to maintain the global state that makes separate caching locations viable and unless you're doing something analogous to the supercomputing "burst buffers" (which some people are!), it's going to be hard to beat something that doesn't have to pay the cost of extra network hops/bandwidth.
> > > > Cache tiers are also not a feature that all the vendors support in their downstream products, so it will probably see less ongoing investment than you'd expect from such a system.
> > >
> > > Should that be taken as an unofficial sign that the tiering support is likely to fade away?
> > >
> > Nick, you also posted back in October in the "cache tiering deprecated in RHCS 2.0" thread and should remember the deafening silence when I asked that question.
> >
> > I'm actually surprised that Greg said as much as he did now, unfortunately that doesn't really cover all the questions I had back then, in particular long term support and bug fixes, not necessarily more features.
> >
> > We're literally about to order our next cluster and cache-tiering works like a charm for us, even in Hammer.
> > With the (still undocumented) knobs in Jewel and read-forward it will be even more effective.
> >
> > So given the lack of any statements that next cluster will still use the same design as the previous one, since BlueStore isn't ready, bcache and others haven't been tested here to my satisfaction and we know very well what works and what not.
> >
> > So 3 regular (HDD OSD, journal SSD) nodes and 3 cache-tier ones.
> > Dedicated cache-tier nodes allow for deployment of high end CPUs only in those nodes.
> >
> > Another point in favor of cache-tiering is that it can be added at a later stage, while in-node caching requires an initial design with large local SSDs/NVMes or at least the space for them.
> > Because the journal SSDs most people will deploy initially don't tend to be large enough to be effective when used with bcache or similar.
>
> And I think that is the main advantage of tiering, you have a lot more flexibility both during implementation and further down the road. We can look at our hot tier hit ratio and just add more SSD's, either in the spare slots I leave in each chassis or like you have suggested, dedicated SSD nodes.
> > >
> > > I think both approaches have different strengths and probably the difference between a tiering system and a caching one is what causes some of the problems.
> > >
> > > If something like bcache is going to be the preferred approach, then I think more work needs to be done around certifying it for use with Ceph and allowing its behavior to be more controlled by Ceph as well. I assume there are issues around backfilling and scrubbing polluting the cache? Maybe you would want to be able to pass hints down from Ceph, which could also allow per pool cache behavior??
> > >
> > According to the RHCS release notes back then their idea to achieve rainbows and pink ponies was using dm-cache.

Just an update. I spoke to Sage today and the general consensus is that something like bcache or dmcache is probably the long-term goal, but work needs to be done before it's ready for prime time. The current tiering functionality won't be going away in the short term, and not until there is a solid replacement with bcache/dmcache/whatever. But from the sounds of it, there won't be any core dev time allocated to it.

I'm not really too bothered what the solution ends up being, but as we have discussed, the flexibility to shrink/grow the cache without having to rebuild all your nodes/OSDs is a major, almost essential, benefit to me.

I've still got some ideas which I think can improve performance of the tiering functionality, but I'm unsure whether I have the coding skills to pull it off. This might motivate me, though, to try and improve it in its current form.

> >
> > Christian
> > --
> > Christian Balzer        Network/Systems Engineer
> > chibi@xxxxxxx           Global OnLine Japan/Rakuten Communications
> > http://www.gol.com/
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com