about HA infrastructure for hypervisors

B.Candler at pobox.com (Brian Candler) · Thu, 28 Jun 2012 18:02:48 +0100

On Thu, Jun 28, 2012 at 10:40:43AM -0500, Nathan Stratton wrote:
> But wait, yes, I have 16 physical disks, but I am running distribute
> + replicate so the 8 physical boxes are broken up into 4 pairs of
> redundant boxes. When I do a write, I am writing on two servers, or
> 4 physical disks. So in my case, 31.1 MB/s vs about 200 MB/s native
> is not that bad.
> 
> DRBD is MUCH faster, but you're not comparing apples to apples. DRBD
> has worked great for me in the past when I only needed two storage
> nodes to be mirrored in active/active, but as soon as you grow past
> that you need to look at something like Gluster.
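
(To make the arithmetic in the quoted layout concrete, here is a rough
Python sketch. It only models what is described above: 8 boxes with 2 disks
each, grouped into 4 replica pairs. The server names, the example path and
the CRC-based pair selection are made up for illustration; this is not how
GlusterFS actually picks a replica set, and it assumes both disks in a box
take part in every write.)

```python
import zlib

# Hypothetical names for the 8 physical boxes described in the quote.
SERVERS = [f"server{i}" for i in range(1, 9)]
DISKS_PER_SERVER = 2   # 16 physical disks / 8 boxes
REPLICA = 2            # each file is stored on two servers

# Group consecutive servers into replica pairs: 4 pairs in total.
PAIRS = [tuple(SERVERS[i:i + REPLICA]) for i in range(0, len(SERVERS), REPLICA)]


def pair_for(path):
    """Pick one replica pair for a file.

    Stand-in hash for illustration only, not GlusterFS's real placement.
    """
    return PAIRS[zlib.crc32(path.encode()) % len(PAIRS)]


def write_targets(path):
    """Return the servers a write goes to and how many disks that touches."""
    servers = pair_for(path)
    return servers, len(servers) * DISKS_PER_SERVER


if __name__ == "__main__":
    servers, disks = write_targets("vm-images/web01.img")
    print(f"{len(PAIRS)} replica pairs; this write goes to {servers}, "
          f"touching {disks} disks")
```

Whichever pair a file lands on, a single write goes to 2 servers and so to 4
of the 16 disks, which is the point being made about not comparing against
the aggregate of all disks.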

But we're talking different things here:

* VM image (i.e. the root filesystem it boots from; where the O/S sits;
  logs and scratch space)
* Application data storage

You'd be mad to have terabytes of data sitting inside a single VM image
file.  It's unshareable, the VM image is one big humungous blob, and to back
it up effectively you need to run the backup tools within the VM itself.

Furthermore, the performance of GlusterFS is excellent when you mount it
directly.  It only sucks when you're using a gluster-mounted file as a KVM
virtual disk.

So what I'm suggesting is, if you need performance today:

* Use DRBD+LVM for your VM filesystem storage
* Use glusterfs for your "big data", and attach it to those VM(s) which
  need to access it - leveraging the naturally shared nature of glusterfs.

And eventually you'll be able to simplify your system by migrating your VM
images to glusterfs, when performance catches up.

Ganeti can manage both types of cluster, so you don't lose out by learning
it up-front.

Regards,

Brian.