Hi,

I recently installed Gluster 3.7.12 to replace our 3.7.6 in production. I have started moving a few VMs onto it; it is accessed over NFS (not Ganesha, as it's Debian).

Last night one of the nodes rebooted. I wasn't the one on call, but I can see in our monitoring logs that the website hosted on the production VM I moved to the cluster stopped responding for 9 minutes. It looks very much like the problem we have with 3.7.6, where the VMs freeze during heals.

Can someone confirm that 3.7.12 shouldn't freeze the shards waiting for a heal, only the shards being actively healed?

In the VM's console I can see the warning about a task hung for more than 120 seconds, which does seem to indicate that the VM was frozen for a while. It's a simple Debian 8 on Proxmox with a VirtIO disk on an NFS Gluster volume.

Here is the volume config, if that matters:

Options Reconfigured:
cluster.data-self-heal-algorithm: full
features.shard-block-size: 64MB
features.shard: on
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.server-quorum-type: server
cluster.quorum-type: auto
performance.readdir-ahead: on

Thanks

--
Kevin Lemonnier
PGP Fingerprint : 89A5 2283 04A0 E6E9 0111