This is further to my earlier posts on "Very poor heal behaviour in
3.7.9", same test environment.
After testing the heal process by killing glusterfsd on a node I noticed
the following.
- I/O continued at normal speed while glusterfsd was down.
- After restarting glusterfsd, I/O still continued as normal.
- performing a "gluster volume heal datastore2 info" would show some
info, then hang.
- I/O on the cluster would cease. e.g. in a VM where I was running a
command-line build of a large project, the build just stopped. The VM
itself was mostly responsive, but anything that involved accessing the
disk hung.
- if I killed the "gluster volume heal datastore2 info" command, then I/O
in the VMs resumed at a normal pace.
- if I then reissued the "gluster volume heal datastore2 info" command,
I/O would continue for a short while (seconds to minutes) before hanging
again.
- killing the heal info command would resume I/O again.
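For anyone wanting to reproduce, the sequence was roughly as follows (a sketch only; pkill and the force-start are my shorthand for how the brick process was killed and brought back, and "datastore2" is the volume from my setup):

```shell
# Step 1: kill the brick process on one node. Client I/O continues
# normally while the brick is down (replication covers it).
pkill -f glusterfsd

# Step 2: bring the brick back; background self-heal should begin.
gluster volume start datastore2 force

# Step 3: query heal status. This prints some entries, then hangs,
# and client I/O in the VMs stalls at the same time.
gluster volume heal datastore2 info

# Step 4: Ctrl-C / kill the heal info command; client I/O resumes.
# Re-running step 3 reproduces the stall within seconds to minutes.
```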
This looks like some sort of deadlock bug. The heal info command was
optimised for 3.7.8/3.7.9, wasn't it?
thanks,
--
Lindsay Mathieson
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users