On Wed, Feb 27, 2013 at 9:46 PM, Brian Foster <bfoster at redhat.com> wrote:
> On 02/27/2013 10:14 AM, Torbjørn Thorsen wrote:
>> I'm seeing less-than-stellar performance on my Gluster deployment when
>> hosting VM images on the FUSE mount.
<snip>
>> If we use a file on the gluster mount as backing for a loop device,
>> and do a sync write:
>> torbjorn@xen01:/srv/ganeti/shared-file-storage/tmp$ sudo dd if=/dev/zero of=/dev/loop1 bs=1024k count=2000 oflag=sync
>> 2097152000 bytes (2.1 GB) copied, 56.3729 s, 37.2 MB/s
>>
>
> What you might want to try is to compare each case with gluster profiling
> enabled on your volume (e.g., run a 'gluster ... profile info' to clear
> the interval stats, run your test, run another 'profile info' and see
> how many write requests occurred, then divide the amount of data
> transferred by the number of requests).
>
> Running similar tests on a couple of random servers around here brings me
> from 70-80MB/s down to 10MB/s over loop. The profile data clearly shows
> that loop is breaking what were previously 128k (max) write requests
> into 4k requests. I don't know enough about the block layer to say why
> that occurs, but I'd be suspicious of the combination of the block
> interface on top of a filesystem (fuse) with synchronous request
> submission (no caching, writes are immediately submitted to the client
> fs). That said, I'm on an older kernel (or an older loop driver anyway,
> I think), and your throughput above doesn't seem to be much worse with
> loop alone...
>
> Brian

I'm not familiar with the profiling feature, but I think I'm seeing the
same thing: write requests being broken up into smaller ones.

However, by chance I found something which seems to impact performance
even more.

I wanted to retry the dd-to-loop-device-with-sync test today, the same
one I pasted yesterday. However, today the result was quite different.

torbjorn@xen01:/srv/ganeti/shared-file-storage/tmp$ sudo dd if=/dev/zero of=/dev/loop1 bs=1024k count=2000 oflag=sync
303038464 bytes (303 MB) copied, 123.95 s, 2.4 MB/s
^C

So I detached the loop device, set it up again, and re-ran the test.

torbjorn@xen01:/srv/ganeti/shared-file-storage/tmp$ sudo losetup -d /dev/loop1
torbjorn@xen01:/srv/ganeti/shared-file-storage/tmp$ sudo losetup -f loopback.img
torbjorn@xen01:/srv/ganeti/shared-file-storage/tmp$ sudo dd if=/dev/zero of=/dev/loop1 bs=1024k count=2000 oflag=sync
2097152000 bytes (2.1 GB) copied, 55.9117 s, 37.5 MB/s

The situation inside the Xen instance was similar, although with
different numbers. After being up, but mostly idle, for ~5 days:

torbjorn@hennec:~$ sudo dd if=/dev/zero of=bigfile bs=1024k count=2000 oflag=direct
28311552 bytes (28 MB) copied, 35.1314 s, 806 kB/s
^C

After a reboot and a fresh loop device:

torbjorn@hennec:~$ sudo dd if=/dev/zero of=bigfile bs=1024k count=2000 oflag=direct
814743552 bytes (815 MB) copied, 34.7441 s, 23.4 MB/s
^C

These numbers might indicate that loop device performance degrades over
time. However, I haven't seen this on local filesystems, so does it
perhaps only happen when the backing file is on Gluster or FUSE?

I'm on Debian stable, so things aren't exactly box fresh.

torbjorn@xen01:~$ dpkg -l | grep "^ii linux-image-$(uname -r)"
ii linux-image-2.6.32-5-xen-amd64 2.6.32-46 Linux 2.6.32 for 64-bit PCs, Xen dom0 support

I'm not sure how to debug the Gluster -> FUSE -> loop device
interaction, but I might try a newer kernel on the client.
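
For what it's worth, here is the rough profiling sequence I understand
Brian to be describing, in case anyone else wants to repeat the
comparison. The volume name "myvol" is just a placeholder for the real
volume, and I'm assuming profiling needs to be started on the volume
first:

    # enable profiling on the volume (only needed once)
    sudo gluster volume profile myvol start

    # display (and thereby reset) the interval stats before the test
    sudo gluster volume profile myvol info > /dev/null

    # run the test against the loop device backed by the FUSE mount
    sudo dd if=/dev/zero of=/dev/loop1 bs=1024k count=2000 oflag=sync

    # the interval stats now cover only the test; look at the WRITE fop
    # count and block-size histogram, and divide bytes written by the
    # number of requests to get the average request size
    sudo gluster volume profile myvol info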