Re: Very poor heal behaviour in 3.7.9


 



On 27/03/2016 12:33 AM, Lindsay Mathieson wrote:
On 26/03/2016 11:58 PM, Pranith Kumar Karampuri wrote:
Is that the same issue I posted earlier re "gluster volume heal info" appearing to block I/O?

I don't think it is heal info that is blocking I/O. I think it is the client triggering a heal and blocking the fop until the heal completes that results in this pattern. Disabling data-heal should get you out of this problem.


I tried it earlier and it didn't seem to help.

Does anything need to be restarted after cluster.data-self-heal is set off?
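
For reference, a minimal sketch of turning the option off and reading back the value the volume reports. It assumes the volume is datastore2 (the name used in the commands further down) and that the gluster CLI is on the PATH; the helper names are made up for illustration. It only shows what the volume reports and does not settle whether a restart is needed.

```python
import shutil
import subprocess

VOLUME = "datastore2"  # assumption: the volume named later in this message
OPTION = "cluster.data-self-heal"


def set_option_cmd(volume, option, value):
    """argv for: gluster volume set <volume> <option> <value>"""
    return ["gluster", "volume", "set", volume, option, value]


def get_option_cmd(volume, option):
    """argv for: gluster volume get <volume> <option>"""
    return ["gluster", "volume", "get", volume, option]


if __name__ == "__main__":
    if shutil.which("gluster") is None:
        # No CLI on this machine: just show what would be run.
        print(" ".join(set_option_cmd(VOLUME, OPTION, "off")))
        print(" ".join(get_option_cmd(VOLUME, OPTION)))
    else:
        subprocess.run(set_option_cmd(VOLUME, OPTION, "off"), check=True)
        subprocess.run(get_option_cmd(VOLUME, OPTION), check=True)
```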


Tried again this morning. I can 100% replicate the behaviour I noted in my earlier post:

After testing the heal process by killing glusterfsd on a node I noticed the following.

- I/O continued at normal speed while glusterfsd was down.

- After restarting glusterfsd, I/O still continued as normal

- performing a "gluster volume heal datastore2 info" would show some info then hang.

- I/O on the cluster would cease, e.g. in a VM where I was running a command line build of a large project, the build just stopped. The VM itself was mostly responsive but anything that involved accessing the disk hung.

- if I killed the "gluster volume heal datastore2 info" command then I/O in the VMs resumed at a normal pace.

- if I then reissued the "gluster volume heal datastore2 info" command I/O would continue for a short while (seconds - minutes) before hanging again.

- killing the heal info command would resume I/O again.
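
The kill-when-it-hangs step above can be sketched as a timeout wrapper, so a stuck heal info is not left running against the volume. This is only an illustration: the 30 second limit in the usage note is an arbitrary assumption and run_with_timeout is a made-up helper name.

```python
import subprocess

# The command from the steps above.
HEAL_INFO = ["gluster", "volume", "heal", "datastore2", "info"]


def run_with_timeout(argv, timeout_s):
    """Run argv; return its stdout, or None if it ran longer than timeout_s.

    subprocess.run() kills the child process when the timeout expires,
    which mirrors killing the hung command by hand.
    """
    try:
        done = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return None
    return done.stdout
```

Calling run_with_timeout(HEAL_INFO, 30) on a node would return the output if the command finishes, or None if it had to be killed.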


iowait and CPU are under 4% on all three nodes.

Even after I shut down all VMs on datastore2, "gluster volume heal datastore2 info" hung indefinitely with no output.

I had to stop/start datastore2 before the info command would work; it then returned very quickly with:

Brick vnb.proxmox.softlog:/tank/vmdata/datastore2
Number of entries: 0

Brick vng.proxmox.softlog:/tank/vmdata/datastore2
/.shard - Possibly undergoing heal

Number of entries: 1

Brick vna.proxmox.softlog:/tank/vmdata/datastore2
/.shard - Possibly undergoing heal

Number of entries: 1

Unfortunately it's stayed that way for 10 minutes now.
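
To watch whether those entry counts ever drop to zero, the output can be reduced to a per-brick count. The sketch below assumes the layout is exactly as pasted above ("Brick ..." lines followed by "Number of entries: N"); other gluster versions may print it differently, and the function names are made up for illustration.

```python
import re

# The heal info output pasted above, used as sample input.
SAMPLE = """\
Brick vnb.proxmox.softlog:/tank/vmdata/datastore2
Number of entries: 0

Brick vng.proxmox.softlog:/tank/vmdata/datastore2
/.shard - Possibly undergoing heal

Number of entries: 1

Brick vna.proxmox.softlog:/tank/vmdata/datastore2
/.shard - Possibly undergoing heal

Number of entries: 1
"""


def pending_entries(heal_info_text):
    """Map each brick to the 'Number of entries' reported for it."""
    counts = {}
    brick = None
    for line in heal_info_text.splitlines():
        line = line.strip()
        if line.startswith("Brick "):
            brick = line[len("Brick "):]
            continue
        m = re.fullmatch(r"Number of entries:\s*(\d+)", line)
        if m and brick is not None:
            counts[brick] = int(m.group(1))
    return counts


def heal_finished(heal_info_text):
    """True when at least one brick is listed and all report zero entries."""
    counts = pending_entries(heal_info_text)
    return bool(counts) and all(n == 0 for n in counts.values())
```

For the output above, heal_finished(SAMPLE) is False because two bricks still list /.shard.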


I'd like to recheck this behaviour under 3.7.7. Can I just revert to that (Debian packages) without recreating the datastore?

thanks,



-- 
Lindsay Mathieson
