Some additional details, if it helps: there is no cache on the disk, it's
virtio with iothread=1. The file is qcow2 and, according to qemu-img check,
it is not corrupted, but while the VM is running I get I/O errors. As you
can see in the config, performance.stat-prefetch is off, but being on a
Debian system I don't have the virt group, so I just tried to replicate its
settings by hand. Maybe I forgot something.

Thanks !

On Wed, May 18, 2016 at 07:11:08PM +0530, Krutika Dhananjay wrote:
> Hi,
>
> I will try to recreate this issue tomorrow on my machines with the steps
> that Lindsay provided in this thread. I will let you know the result soon
> after that.
>
> -Krutika
>
> On Wednesday, May 18, 2016, Kevin Lemonnier <lemonnierk@xxxxxxxxx> wrote:
> > Hi,
> >
> > Some news on this.
> > Over the weekend the RAID card of the node ipvr2 died, and I thought
> > that maybe that was the problem all along. The RAID card was changed
> > and yesterday I reinstalled everything.
> > Same problem just now.
> >
> > My test is simple: while using the website hosted on the VMs the whole
> > time, I reboot ipvr50 and wait for the heal to complete, migrate all
> > the VMs off ipvr2 and reboot it, wait for the heal to complete, then
> > migrate all the VMs off ipvr3 and reboot it.
> > Every time, the first database VM (which is the only one really using
> > its disk during the heal) starts showing I/O errors on its disk.
> >
> > Am I really the only one with that problem ?
> > Maybe one of the drives is dying too, who knows, but SMART isn't
> > saying anything ..
> >
> >
> > On Thu, May 12, 2016 at 04:03:02PM +0200, Kevin Lemonnier wrote:
> >> Hi,
> >>
> >> I had a problem some time ago with 3.7.6 and freezing during heals,
> >> and multiple people advised to use 3.7.11 instead. Indeed, with that
> >> version the freeze problem is fixed, it works like a dream ! You can
> >> almost not tell that a node is down or healing, everything keeps
> >> working except for a little freeze when the node has just gone down
> >> and, I assume, hasn't timed out yet, but that's fine.
> >>
> >> Now I have a 3.7.11 volume on 3 nodes for testing, and the VMs are
> >> Proxmox VMs with qcow2 disks stored on the gluster volume.
> >> Here is the config :
> >>
> >> Volume Name: gluster
> >> Type: Replicate
> >> Volume ID: e4f01509-beaf-447d-821f-957cc5c20c0a
> >> Status: Started
> >> Number of Bricks: 1 x 3 = 3
> >> Transport-type: tcp
> >> Bricks:
> >> Brick1: ipvr2.client:/mnt/storage/gluster
> >> Brick2: ipvr3.client:/mnt/storage/gluster
> >> Brick3: ipvr50.client:/mnt/storage/gluster
> >> Options Reconfigured:
> >> cluster.quorum-type: auto
> >> cluster.server-quorum-type: server
> >> network.remote-dio: enable
> >> cluster.eager-lock: enable
> >> performance.quick-read: off
> >> performance.read-ahead: off
> >> performance.io-cache: off
> >> performance.stat-prefetch: off
> >> features.shard: on
> >> features.shard-block-size: 64MB
> >> cluster.data-self-heal-algorithm: full
> >> performance.readdir-ahead: on
> >>
> >>
> >> As mentioned, I rebooted one of the nodes to test the freezing issue
> >> I had on previous versions, and apart from the initial timeout,
> >> nothing: the website hosted on the VMs keeps working like a charm
> >> even during heal.
> >> Since it's a test setup there isn't any load on it, though. I just
> >> tried to refresh the database by importing the production one on the
> >> two MySQL VMs, and both of them started showing I/O errors. I tried
> >> shutting them down and powering them on again, but same thing; even
> >> starting full heals by hand doesn't solve the problem, the disks are
> >> corrupted. They still work, but sometimes they remount their
> >> partitions read-only ..
> >>
> >> I believe there are a few people already using 3.7.11; has no one
> >> noticed corruption problems ? Anyone using Proxmox ?
> >> As already mentioned in multiple other threads on this mailing list
> >> by other users, I also pretty much always have shards in heal info,
> >> but nothing "stuck" there; they always go away in a few seconds,
> >> getting replaced by other shards.
> >>
> >> Thanks
> >>
> >> --
> >> Kevin Lemonnier
> >> PGP Fingerprint : 89A5 2283 04A0 E6E9 0111
> >
> >
> >> _______________________________________________
> >> Gluster-users mailing list
> >> Gluster-users@xxxxxxxxxxx
> >> http://www.gluster.org/mailman/listinfo/gluster-users
> >
> >
> > --
> > Kevin Lemonnier
> > PGP Fingerprint : 89A5 2283 04A0 E6E9 0111

--
Kevin Lemonnier
PGP Fingerprint : 89A5 2283 04A0 E6E9 0111
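PS: for anyone wanting to compare against the settings replicated by hand
above, this is roughly what the virt group file that Debian is missing
(shipped as /var/lib/glusterd/groups/virt on distributions that include it)
contained in the 3.7.x era. This is from memory, so treat it as a sketch
and double-check against a package that actually ships the file:

```ini
# Approximate contents of /var/lib/glusterd/groups/virt (3.7.x era).
# Keys are resolved to their full option names by glusterd when applied
# with `gluster volume set <volname> group virt`.
quick-read=off
read-ahead=off
io-cache=off
stat-prefetch=off
eager-lock=enable
remote-dio=enable
quorum-type=auto
server-quorum-type=server
```

The options in the volume info above appear to match this list, so if the
file really looked like this, nothing obvious was forgotten.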