Just to add: the main reason it seems to make a difference is the metadata updates that live on the actual OSD. When you are doing small-block writes, these metadata updates seem to take almost as long as the data itself, so although the writes are getting coalesced, the actual performance isn't much better. I ran a blktrace a week ago while writing 500MB in 64k blocks to an OSD; the actual data was flushed to the OSD in a couple of seconds, while another 30 seconds was spent writing out metadata and doing EXT4/XFS journal writes.

Normally I have found flashcache to perform really poorly, as it does everything in 4kb blocks, meaning that when you start throwing larger blocks at it, it can actually slow things down. However, for the purposes of OSDs you can set the IO cutoff size limit to around 16-32kb, and then it should only cache the metadata updates.

I'm hoping to run some benchmarks before and after adding flashcache to an SSD-journaled OSD this week, so I will post results when I have them.
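For anyone who wants to repeat the test, the trace was captured roughly along these lines (/dev/sdb is just a placeholder for the OSD's data disk, and you generate the 64k write load however suits your setup):

  # capture block-layer events for the OSD data disk while the writes run
  blktrace -d /dev/sdb -o osd-trace &

  # ... write the 500MB of 64k blocks to the OSD here ...

  # stop the trace and replay it in human-readable form; the completion (C)
  # events show the data writes finishing early and the metadata/journal
  # writes trailing on afterwards
  kill %1
  blkparse -i osd-trace | less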
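To be clear about the cutoff I mentioned: as far as I know the flashcache tunable for this is the per-cache sequential skip threshold, so something like the following (the exact sysctl name depends on what your cache device is called; "cachedev" here is just an example):

  # don't cache sequential IO larger than 32kb, so the large coalesced data
  # writes bypass the SSD and only the small metadata/journal writes are cached
  sysctl -w dev.flashcache.cachedev.skip_seq_thresh_kb=32

I haven't verified yet that this catches everything the way I expect, which is what the before/after benchmarks should show.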
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
> Brendan Moloney
> Sent: 23 March 2015 21:02
> To: Noah Mehl
> Cc: ceph-users@xxxxxxxxxxxxxx
> Subject: Re: OSD + Flashcache + udev + Partition uuid
>
> This would be in addition to having the journal on SSD. The journal
> doesn't help at all with small random reads and has a fairly limited
> ability to coalesce writes.
>
> In my case, the SSDs we are using for journals should have plenty of
> bandwidth/IOPs/space to spare, so I want to see if I can get a little
> more out of them.
>
> -Brendan
>
> ________________________________________
> From: Noah Mehl [noahmehl@xxxxxxxxxxxxxxxxxx]
> Sent: Monday, March 23, 2015 1:45 PM
> To: Brendan Moloney
> Cc: ceph-users@xxxxxxxxxxxxxx
> Subject: Re: OSD + Flashcache + udev + Partition uuid
>
> We deployed with just putting the journal on an SSD directly, why would
> this not work for you? Just wondering really :)
>
> Thanks!
>
> ~Noah
>
> > On Mar 23, 2015, at 4:36 PM, Brendan Moloney <moloney@xxxxxxxx> wrote:
> >
> > I have been looking at the options for SSD caching for a bit now. Here
> > is my take on the current options:
> >
> > 1) bcache - Seems to have lots of reliability issues mentioned on the
> > mailing list, with little sign of improvement.
> >
> > 2) flashcache - Seems to be no longer (or minimally?)
> > developed/maintained; instead folks are working on the fork enhanceio.
> >
> > 3) enhanceio - Fork of flashcache. Dropped the ability to skip caching
> > on sequential writes, which many folks have claimed is important for
> > Ceph OSD caching performance.
> > (see: https://github.com/stec-inc/EnhanceIO/issues/32)
> >
> > 4) LVM cache (dm-cache) - There is now a user-friendly way to use
> > dm-cache, through LVM. Allows sequential writes to be skipped. You need
> > a pretty recent kernel.
> >
> > I am going to be trying out LVM cache on my own cluster in the next few
> > weeks. I will share my results here on the mailing list. If anyone else
> > has tried it out I would love to hear about it.
> >
> > -Brendan
> >
> >> In long-term use I also had some issues with flashcache and enhanceio.
> >> I've noticed frequent slow requests.
> >>
> >> Andrei

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com