Re: GlusterFS as virtual machine storage

On 8/23/2017 10:44 PM, Pavel Szalbot wrote:
Hi,

On Thu, Aug 24, 2017 at 2:13 AM, WK <wkmail@xxxxxxxxx> wrote:
The default timeout for most OS versions is 30 seconds and the Gluster
timeout is 42, so yes you can trigger an RO event.
I get a read-only mount within approximately 2 seconds after failed IO.

Hmm, we don't see that, even on busy VMs.
We ARE using QCOW2 disk images though.
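
For reference, the 42 seconds mentioned above is gluster's network.ping-timeout, which you can check or change per volume if you want to experiment ('myvol' below is just a placeholder for your volume name):

# gluster volume get myvol network.ping-timeout
# gluster volume set myvol network.ping-timeout 30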

Also, though we no longer use oVirt, I am still on the list. They are heavy Gluster users and they would be howling if they all had your experience.


Though it is easy enough to raise, as Pavel mentioned:

# echo 90 > /sys/block/sda/device/timeout
AFAIK this is applicable only for directly attached block devices
(non-virtualized).

No, if you use SATA/IDE emulation (NOT virtio) it is there WITHIN the VM.
We have a lot of legacy VMs from older projects/workloads that still use that emulation, and we haven't bothered changing them because "they are working fine now".
It is NOT there on virtio.
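
You can check for yourself inside a guest (device names here are just examples; sda would be a SATA/IDE-emulated disk, vda a virtio-blk one):

# cat /sys/block/sda/device/timeout
# ls /sys/block/vda/device/

The first typically shows the 30-second SCSI default; on the virtio disk there is simply no 'timeout' attribute to raise.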


Likewise, virtio "disks" don't even have a timeout value that I am aware of,
and I don't recall them being extremely sensitive to disk issues on
Gluster, NFS, or DAS.
We use only virtio, and these problems are persistent: temporarily
suspending a node (e.g. for a HW or Gluster upgrade, or a reboot) is very scary,
because we often end up with read-only filesystems on all VMs.

However, we use ext4, so I cannot comment on XFS.

We use the fuse mount because we are lazy and haven't upgraded to libgfapi. I hope to start a new cluster with libgfapi shortly because of the better performance. We also use a localhost mount for the gluster driveset on each compute node (i.e. so-called hyperconverged), so the only gluster-only kit is the lightweight arbiter box. The VMs in the gluster pool therefore get a local write and then only one off-server write (to the other gluster-enabled compute host), which means pretty good performance.
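
In case it helps anyone, the localhost fuse mount is nothing fancy, just an fstab entry along these lines (volume name and mount point are made up for the example):

localhost:/vmstore  /var/lib/libvirt/images  glusterfs  defaults,_netdev  0 0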

We use the gluster-included 'virt' tuning set (applied as sketched right after the list):

performance.quick-read=off
performance.read-ahead=off
performance.io-cache=off
performance.stat-prefetch=off
performance.low-prio-threads=32
network.remote-dio=enable
cluster.eager-lock=enable
cluster.quorum-type=auto
cluster.server-quorum-type=server
cluster.data-self-heal-algorithm=full
cluster.locking-scheme=granular
cluster.shd-max-threads=8
cluster.shd-wait-qlength=10000
features.shard=on
user.cifs=off
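
Those can be set one at a time with 'gluster volume set', or all at once via the bundled group file, assuming your gluster packages ship /var/lib/glusterd/groups/virt (recent ones do). 'myvol' is again just a placeholder:

# gluster volume set myvol group virt
# gluster volume set myvol features.shard on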

We did play with the shard size and have settled on 64M, though I've seen recommendations of 128M and 512M for VMs. We didn't really notice much of a difference with any of those, as long as they were at least 64M.
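
If you want to play with it too, the shard size is a regular volume option, along the lines of:

# gluster volume set myvol features.shard-block-size 64MB

Note that it only applies to files created after the change; existing images keep whatever shard size they were written with.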


This discussion will probably end before I migrate VMs from Gluster to
local storage on our OpenStack nodes, but I might run some tests
afterwards and keep you posted.

I would be interested in your results. You may also look into Ceph. It is more complicated than Gluster (well, more complicated than our simple little Gluster arrangement), but the OpenStack people swear by it. It wasn't suited to our needs, but it tested well when we looked into it last year.

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users



