I've done a bit of testing with EnhanceIO on my cluster and I can see a definite improvement in read performance for cached data. The increase is around 3-4 times the cluster speed prior to using EnhanceIO for large block size IO (1M and 4M). I ran a concurrent test with a single "dd if=/dev/vda of=/dev/null bs=1M/4M iflag=direct" instance in each of 20 vms spread over 4 host servers. Prior to EnhanceIO I was getting around 30-35MB/s per guest vm regardless of how many times I ran the test. With EnhanceIO (from the second run onwards) I was hitting over 130MB/s per vm. I've not seen any lag in the performance of other vms while using EnhanceIO, unlike the considerable lag without it. The ssd disk utilisation was not hitting much over 60%.

The small block size (4K) performance hasn't changed with EnhanceIO, which makes me think that the performance of the osds themselves is the limit with small block sizes; I wasn't getting much over 2-3MB/s per guest vm.

On the contrary, when I tried the firefly cache pool on the same hardware, my cluster performed significantly slower. The whole cluster seemed under a lot more load, throughput dropped to around 12-15MB/s, and the other guest vms were very, very slow. The ssd disks were 100% utilised throughout the test, with the majority of the IO being writes.

I admit that these shouldn't be considered definitive performance tests of a ceph cluster, as this is a live cluster with disk IO activity outside of the test vms. The background load is not much (300-500 IO/s), mainly reads. However, it still indicates that there is room for improvement in ceph's cache pool implementation. Looking at my results, I think ceph is missing a lot of hits on the read cache, which causes the osds to write a lot of data. With EnhanceIO I was getting well over a 50% read hit ratio, and the main activity on the ssds was read IO, unlike with ceph.

Outside of the tests, I've left EnhanceIO running on the osd servers. It has been a few days now and the hit ratio on the osds is around 8-11%, which seems a bit low. I was wondering if I should change the EnhanceIO block size to 2K instead of the default 4K. Taking into account ceph's 4M object size, I am not sure whether this will help the hit ratio. Does anyone have an idea? I've appended rough sketches of the test command, the cache tier setup and the EnhanceIO stats check at the bottom of this mail, below the quoted thread.

Andrei

----- Original Message -----
> From: "Mark Nelson" <mark.nelson at inktank.com>
> To: "Robert LeBlanc" <robert at leblancnet.us>, "Mark Nelson" <mark.nelson at inktank.com>
> Cc: ceph-users at lists.ceph.com
> Sent: Monday, 22 September, 2014 10:49:42 PM
> Subject: Re: Bcache / Enhanceio with osds
>
> Likely it won't since the OSD is already coalescing journal writes.
>
> FWIW, I ran through a bunch of tests using seekwatcher and blktrace at
> 4k, 128k, and 4m IO sizes on a 4 OSD cluster (3x replication) to get a
> feel for what the IO patterns are like for the dm-cache developers. I
> included both the raw blktrace data and seekwatcher graphs here:
>
> http://nhm.ceph.com/firefly_blktrace/
>
> there are some interesting patterns but they aren't too easy to spot (I
> don't know why Chris decided to use blue and green by default!)
>
> Mark
>
> On 09/22/2014 04:32 PM, Robert LeBlanc wrote:
> > We are still in the middle of testing things, but so far we have had
> > more improvement with SSD journals than the OSD cached with bcache
> > (five OSDs fronted by one SSD).
> > We still have yet to test if adding a bcache layer in addition to the
> > SSD journals provides any additional improvements.
> >
> > Robert LeBlanc
> >
> > On Sun, Sep 14, 2014 at 6:13 PM, Mark Nelson <mark.nelson at inktank.com> wrote:
> >
> > On 09/14/2014 05:11 PM, Andrei Mikhailovsky wrote:
> >
> > Hello guys,
> >
> > Was wondering if anyone uses or done some testing with using
> > bcache or enhanceio caching in front of ceph osds?
> >
> > I've got a small cluster of 2 osd servers, 16 osds in total and
> > 4 ssds for journals. I've recently purchased four additional ssds
> > to be used for ceph cache pool, but i've found performance of
> > guest vms to be slower with the cache pool for many benchmarks.
> > The write performance has slightly improved, but the read
> > performance has suffered a lot (as much as 60% in some tests).
> >
> > Therefore, I am planning to scrap the cache pool (at least until it
> > matures) and use either bcache or enhanceio instead.
> >
> > We're actually looking at dm-cache a bit right now. (and talking
> > some of the developers about the challenges they are facing to help
> > improve our own cache tiering) No meaningful benchmarks of dm-cache
> > yet though. Bcache, enhanceio, and flashcache all look interesting
> > too. Regarding the cache pool: we've got a couple of ideas that
> > should help improve performance, especially for reads. There are
> > definitely advantages to keeping cache local to the node though. I
> > think some form of local node caching could be pretty useful going
> > forward.
> >
> > Thanks
> >
> > Andrei
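
For reference, the per-guest read test mentioned above was essentially the following, run concurrently inside each of the 20 vms (the count value is only illustrative; the real runs read a lot more data):

for bs in 1M 4M; do
    echo "=== block size $bs ==="
    # direct IO so the guest page cache doesn't inflate the numbers;
    # /dev/vda is the RBD-backed virtio disk as seen from inside the guest
    dd if=/dev/vda of=/dev/null bs=$bs count=4096 iflag=direct 2>&1 | tail -n 1
done

The per-vm MB/s figures quoted above are simply the throughput line that dd prints at the end of each run.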
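
The firefly cache pool test used the standard tiering commands; the setup was roughly along these lines (the pool names, PG counts and thresholds here are illustrative examples, not my exact values):

# ssd-backed pool attached as a writeback cache tier in front of the rbd pool
ceph osd pool create ssd-cache 512 512
ceph osd tier add rbd ssd-cache
ceph osd tier cache-mode ssd-cache writeback
ceph osd tier set-overlay rbd ssd-cache
ceph osd pool set ssd-cache hit_set_type bloom
ceph osd pool set ssd-cache hit_set_count 1
ceph osd pool set ssd-cache hit_set_period 3600
ceph osd pool set ssd-cache target_max_bytes 500000000000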
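
On the hit ratio and block size question, a sketch of checking the EnhanceIO stats and of recreating the cache with a 2K block is below. The cache name and device paths are just examples, and the exact eio_cli flags and stats field names may differ between EnhanceIO versions, so treat them as assumptions; as far as I understand, the block size can only be set when the cache is created, so moving to 2K would mean deleting and recreating the cache.

# current hit statistics (field names vary between releases)
grep -i hit /proc/enhanceio/osd1_cache/stats

# recreate the cache with a 2K block instead of the default 4K;
# -b for the block size is my assumption, check eio_cli's help output
eio_cli delete -c osd1_cache
eio_cli create -d /dev/sdb -s /dev/sdg1 -p lru -m ro -b 2048 -c osd1_cache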