Re: OSD + FlashCache vs. Cache Pool for RBD...

On 10/03/14 23:18, Xavier Trilla wrote:
> -        What do you think is a better approach to improve the
> performance of RBD for VMs: Caching OSDs with FlashCache or using SSD
> Cache Pools?

Well, as has been mentioned, Cache Pools aren't available yet. However,
I'm starting to think about using FlashCache to give the RBDs a
much-needed kick-along.

We're using Ceph with an OpenNebula deployment that is still in testing.
At this time there appears to be little support for putting FlashCache
in front of RBDs, either on physical hosts or within VM provisioning
systems like OpenStack/OpenNebula.

On my TODO list is to investigate writing such a datastore driver for
OpenNebula.

Sébastien Han wrote a guide that goes through setting it up by hand:
http://www.sebastien-han.fr/blog/2012/11/15/make-your-rbd-fly-with-flashcache/

This tells us the idea is at least doable.

I'm thinking of something along the lines of using an LVM partition on
SSD as a write-through (or even write-back) cache.

OpenNebula just uses the default RBD format (format 1), which, from what
I understand, doesn't support COW cloning and such.  Every non-persistent
VM therefore needs a full temporary copy of its image.

On our 3-node cluster (Intel Core i3 3570T CPUs, 8GB RAM, 2×3TB HDDs
each and 10GB of SSD-based journal per OSD), I've observed that making
those temporary copies really does give the cluster a fair hiding: CPU
load averages reach 6 or more, which is getting a bit much for dual-core
CPUs.

The thought I had was to use the newer RBD format (format 2), which does
support COW.  The logic for provisioning storage would then be:

- If the image is non-persistent, make a COW clone of the original,
otherwise use the original image as-is.
- Create a local volume on the SSD LVM partition (ideally equal to the
size of the original image)
- Map the remote RBD device
- Set up flashcache to use the RBD and LVM cache volume to produce a
composite cached device.
- Point the VM at the device produced by flashcache.
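
The steps above could be sketched roughly as follows.  This is only an
illustration, not a tested datastore driver: it builds the command lines
and does not run them.  The pool, image, snapshot and volume group names
are made up, and the naming scheme for clones and cache volumes is an
assumption.  It also assumes the parent image is format 2 with a
protected snapshot to clone from, and that udev creates the usual
/dev/rbd/<pool>/<image> symlink when the RBD is mapped.

```python
def build_provision_commands(pool, image, snapshot, vm_id, volume_group,
                             size_mb, persistent, cache_mode="thru"):
    """Return (commands, vm_device) for one VM disk; nothing is executed.

    All naming conventions here (clone name, cache LV name, cached device
    name) are hypothetical and only serve the illustration.
    """
    if cache_mode not in ("thru", "back", "around"):
        raise ValueError("cache_mode must be 'thru', 'back' or 'around'")

    commands = []

    # Step 1: non-persistent images get a COW clone, persistent ones
    # use the original image as-is.
    if persistent:
        target = image
    else:
        target = f"{image}-vm{vm_id}"
        commands.append(["rbd", "clone",
                         f"{pool}/{image}@{snapshot}", f"{pool}/{target}"])

    # Step 2: local cache volume on the SSD-backed volume group.
    cache_lv = f"cache-{target}"
    commands.append(["lvcreate", "-L", f"{size_mb}M",
                     "-n", cache_lv, volume_group])

    # Step 3: map the remote RBD on this host.
    commands.append(["rbd", "map", f"{pool}/{target}"])

    # Step 4: combine the SSD volume and the RBD into one cached device.
    cached_name = f"cached-{target}"
    commands.append(["flashcache_create", "-p", cache_mode, cached_name,
                     f"/dev/{volume_group}/{cache_lv}",
                     f"/dev/rbd/{pool}/{target}"])

    # Step 5: the VM should be pointed at this device-mapper node.
    return commands, f"/dev/mapper/{cached_name}"


if __name__ == "__main__":
    cmds, device = build_provision_commands(
        "one", "ubuntu-base", "golden", 42, "ssd", 10240, persistent=False)
    for argv in cmds:
        print(" ".join(argv))
    print("VM disk:", device)
```

Write-through ("thru") is the default here because a write-back cache
holds dirty blocks on the local SSD, which matters if the VM ever has to
be started on a different host.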

The above can probably be applied to OpenStack as well, maybe some hocus
pocus work with ephemeral volumes in nova-volume or some such?

One downside of the above arrangement: I read that support for mapping
newer-format RBDs is only present in fairly recent kernels.  I'm running
Ubuntu 12.04 on the cluster at present with its stock 3.2 kernel.  There
is a PPA for the 3.11 kernel used in Ubuntu 13.10, but if you're looking
at a new deployment it might be better to wait until 14.04: then you'll
get kernel 3.13.
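
A provisioning script could check the running kernel before it tries to
map a newer-format RBD.  The sketch below is only that: the threshold of
3.11 is an assumption taken from the PPA kernel mentioned above, not the
exact version where support landed, so adjust it to whatever the Ceph
documentation says for your release.

```python
import platform
import re

# Assumed threshold (the 3.11 PPA kernel mentioned above), not an
# authoritative minimum for newer-format RBD support.
ASSUMED_MINIMUM = (3, 11)


def kernel_version(release):
    """Parse (major, minor) from a release string like '3.2.0-58-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"cannot parse kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_at_least(release, minimum):
    """Compare numerically, so that 3.11 counts as newer than 3.2."""
    return kernel_version(release) >= minimum


if __name__ == "__main__":
    release = platform.release()
    if kernel_at_least(release, ASSUMED_MINIMUM):
        print(f"{release}: meets the assumed minimum")
    else:
        print(f"{release}: older than the assumed minimum")
```

Comparing integer tuples rather than strings matters here: as strings,
"3.2" would sort after "3.11".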

Anyone else have any ideas on the above?

Regards,
-- 
Stuart Longland
Systems Engineer
     _ ___
\  /|_) |                           T: +61 7 3535 9619
 \/ | \ |     38b Douglas Street    F: +61 7 3535 9699
   SYSTEMS    Milton QLD 4064       http://www.vrt.com.au


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




