On Thu, Jul 19, 2012 at 2:14 AM, Jake Grimmett <jog at mrc-lmb.cam.ac.uk> wrote:

> Dear Pranith / Anand,
>
> Update on our progress with using KVM & Gluster:
>
> We built a two-server (Dell R710) cluster; each box has...
> a 5 x 500 GB SATA RAID5 array (software RAID)
> an Intel 10GbE HBA
> One box has 8GB RAM, the other 48GB
> both have 2 x E5520 Xeons
> CentOS 6.3 installed
> Gluster 3.3 installed from the rpm files on the gluster site
>
> 1) create a replicated gluster volume (on top of xfs)
> 2) set up qemu/kvm with a gluster volume (mounts localhost:/gluster-vol)
> 3) sanlock configured (this is evil!)
> 4) build a virtual machine with a 30GB qcow2 image, 1GB RAM
> 5) clone this VM into 4 machines
> 6) check that live migration works (OK)
>
> Start basic test cycle:
> a) migrate all machines to host #1, then reboot host #2
> b) watch logs for self-heal to complete
> c) migrate VMs to host #2, reboot host #1
> d) check logs for self-heal
>
> The above cycle can be repeated numerous times, and completes without
> error, provided that no (or little) load is on the VMs.
>
> If I give the VMs a workload, such as running "bonnie++" on each VM,
> things start to break:
> 1) it becomes almost impossible to log in to each VM
> 2) the kernel on each VM starts giving timeout errors,
>    i.e. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> 3) top / uptime on the hosts shows a load average of up to 24
> 4) dd write speed (block size 1K) to gluster is around 3MB/s on the host
>
> While I agree that running bonnie++ on four VMs is possibly unfair, there
> are load spikes on quiet machines (yum updates etc). I suspect that the
> I/O of one VM starts blocking that of another VM, and the pressure builds
> up rapidly on gluster - which does not seem to cope well under pressure.
> Possibly this is down to the access pattern / block size of qcow2 disks?
>
> I'm (slightly) disappointed.
>
> Though it doesn't corrupt data, the I/O performance is < 1% of my
> hardware's capability. Hopefully work on buffering and other tuning will
> fix this? Or maybe the work mentioned on getting qemu talking directly to
> gluster will fix this?

Do you mean that the I/O is bad when you are performing the migration? Or
bad in general?

If it is bad in general, the qemu driver should help. Also try presenting
each VM with a FUSE mount point of its own - we have seen that improve the
overall system IOPS (a rough sketch of this, and of the volume setup above,
is appended below).

If performance is only slow during failover/failback, we probably need to
do some more internal QoS tuning to de-prioritize self-heal traffic so that
it does not preempt VM traffic for resources.

Avati
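
For reference, a minimal sketch of the replicated volume and localhost FUSE
mount described in the setup above, plus the self-heal check used between
reboots. Host names (server1/server2), the brick path /export/brick1 and the
image directory are illustrative, not taken from the original mail:

  # On both servers: create the brick directory on the xfs filesystem.
  mkdir -p /export/brick1

  # On one server: peer the two boxes and create the 2-way replicated volume.
  gluster peer probe server2
  gluster volume create gluster-vol replica 2 \
      server1:/export/brick1 server2:/export/brick1
  gluster volume start gluster-vol

  # On each KVM host: FUSE-mount the volume locally for the qcow2 images.
  mkdir -p /var/lib/libvirt/images/gluster
  mount -t glusterfs localhost:/gluster-vol /var/lib/libvirt/images/gluster

  # After rebooting a host, check that self-heal has caught up before
  # migrating the VMs back.
  gluster volume heal gluster-vol info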
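And a rough sketch of the per-VM FUSE mount suggestion, assuming the same
gluster-vol volume and four guests named vm1..vm4 (names are illustrative).
The same volume is mounted once per guest, so each VM's I/O goes through its
own glusterfs client process instead of all sharing one mount:

  # One FUSE mount (and hence one glusterfs client process) per guest.
  for vm in vm1 vm2 vm3 vm4; do
      mkdir -p /var/lib/libvirt/images/$vm
      mount -t glusterfs localhost:/gluster-vol /var/lib/libvirt/images/$vm
  done

  # Then point each guest's disk at its own mount in the libvirt domain XML,
  # e.g. /var/lib/libvirt/images/vm1/vm1.qcow2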