On Fri, 3 Oct 2014 11:24:38 +0100 (BST) Andrei Mikhailovsky wrote:

> From: "Christian Balzer" <chibi@xxxxxxx>
> To: ceph-users@xxxxxxxxxxxxxx
> Sent: Friday, 3 October, 2014 2:06:48 AM
> Subject: Re: ceph, ssds, hdds, journals and caching
>
> > On Thu, 2 Oct 2014 21:54:54 +0100 (BST) Andrei Mikhailovsky wrote:
> > > Hello Cephers,
> > >
> > > I am a bit lost on the best ways of using ssds and hdds for a ceph
> > > cluster which uses rbd + kvm for guest vms.
> > >
> > > At the moment I've got 2 osd servers which currently have 8 hdd osds
> > > (max 16 bays) each and 4 ssd disks. Currently, I am using 2 ssds for
> > > osd journals and I've got 2x512GB ssds spare, which are waiting to
> > > be utilised. I am running Ubuntu 12.04 with the 3.13 kernel from
> > > Ubuntu 14.04 and the latest firefly release.
> > >
> > In case you're planning to add more HDDs to those nodes, the obvious
> > use case for those SSDs would be additional journals.
>
> From what I've seen so far, the two ssds that I currently use for
> journaling are happy serving 8 osds and I do not have much load on them.
> Having more osds per server might change that though, you are right. But
> at the moment I was hoping to improve the read performance, especially
> for small block sizes, hence I was thinking of adding the caching layer.
>
> > Also depending on your use case, a kernel newer than 3.13 (which also
> > is not getting any upstream updates/support) might be a good idea.
>
> Yes, indeed. I am considering the latest supported kernels from the
> Ubuntu team.
>
> > > I've tried to use the ceph cache pool tier and the results were not
> > > good. My small cluster slowed down by quite a bit and I've disabled
> > > the cache tier altogether.
> > >
> > Yeah, this feature is clearly a case of "wait for the next major
> > release or the one after that and try again".
>
> Anyone know if the latest 0.80.6 firefly improves the cache behaviour?
> I've seen a bunch of changes in the cache tiering, however, I am not
> sure if these are addressing the stability of the tier or its
> efficiency?
>
Not a Ceph developer, but I think these were bug fixes for the most part.
I wouldn't expect major (invasive code changes) improvements before a
future release (and with future I mean probably the next one over).

> > > My question is how would one utilise the ssds in the best manner to
> > > achieve a good performance boost compared to a pure hdd setup?
> > > Should I enable block level caching (likes of bcache or similar)
> > > using all my ssds and not bother with ssd journals? Should I keep
> > > the journals on two ssds and utilise the remaining two ssds for
> > > bcache? Or is there a better alternative?
> > >
> > This has all been discussed very recently here and the results were
> > inconclusive at best. In some cases reads were improved, but for
> > writes it was potentially worse than normal Ceph journals.
> >
> > Have you monitored your storage nodes (I keep recommending atop for
> > this) during a high load time? If your SSDs are becoming the
> > bottleneck and not the actual disks (doubtful, but verify), more
> > journals.
>
> I am monitoring my ceph cluster with Zabbix and I do not have a
> significant load on the servers at all.
>
While I doubt you're hitting any particular bottlenecks on your storage
servers, I don't think Zabbix (very limited experience with it, so I
might be wrong) monitors everything, nor does it do so at a sufficiently
high frequency to show what is going on during a peak or an fio test
from a client. Thus my suggestion to stare at it live with atop (on all
nodes).

> My biggest concern is the single thread performance of vms. From what I
> can see, this is the main downside of ceph. On average, I am not getting
> much over 35-40MB/s per thread in cold data reads. This is compared with
> a single hdd read performance of 150-160MB/s. Having about 1/4 of the
> raw device performance is a bit worrying, especially compared with what
> I've read. I should be getting about 1/2 of the raw drive performance
> for a single thread, but I am not. My hope was that with a caching tier
> I could increase it.
>
Have a look at:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2014-April/028552.html

Your numbers look very much like mine before increasing the read_ahead
buffer.
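For reference, this is the sort of tweak I mean; the device name (vda)
and the 4096 value below are just examples, so adjust them for your VMs
and verify against your own workload:

    # inside a VM, check the current read-ahead of the RBD-backed disk
    cat /sys/block/vda/queue/read_ahead_kb

    # bump it up (as root) for a quick before/after comparison
    echo 4096 > /sys/block/vda/queue/read_ahead_kb

    # to make it stick across reboots, a udev rule along these lines
    # (untested sketch) should do:
    # SUBSYSTEM=="block", KERNEL=="vd[a-z]", ACTION=="add|change", \
    #   ATTR{queue/read_ahead_kb}="4096"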
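And to see what a single client thread actually gets before and after a
change like that, a simple sequential fio read from inside a VM is
enough; /dev/vdb here is just a placeholder for a scratch RBD volume,
point it at whatever test device you have (a read like this is
non-destructive, but don't flip it to writes on a disk you care about):

    fio --name=single-thread-read --filename=/dev/vdb --ioengine=libaio \
        --direct=1 --rw=read --bs=4M --numjobs=1 --iodepth=1 \
        --runtime=60 --time_based --group_reporting

Run it against cold data (or drop the caches first) if you want numbers
comparable to your 35-40MB/s figure.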
> > Other than that, maybe create a 1TB (usable space) SSD pool for
> > guests with special speed requirements...
>
> I am planning to do this for the database volumes, however, from what
> I've read so far, there are performance bottlenecks and the current
> stable firefly is not optimised for ssds. I've not tried it myself, but
> it doesn't look like having a dedicated ssd pool will bring a
> significant increase in performance.
>
It will be faster than HDDs and also has future potential for
improvement, but don't expect miracles indeed. If in doubt, just test
it. ^^

Christian

> Has anyone tried using bcache or dm-cache with ceph? Any tips on how to
> integrate it? From what I've read so far, they require you to format
> the existing hdd, which is not feasible if you have an existing live
> cluster.
>
> Cheers

-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com